Preface

Data correlation has not been explicitly addressed in any past
studies which develop aircraft airframe cost estimating relationships.
The Components of Variance model can recognize multiple sources of error
and, when adapted to airframe cost estimation, can explicitly deal with
this correlation problem and improve the predictive qualities of the
resulting cost estimating relationship. The Introduction, Chapter I,
and Conclusions, Chapter VI, outline the thrust and results of this
thesis in a non-mathematical manner. The development and explanation of
the techniques employed, along with the specific results obtained, are
presented in the remaining chapters.

My sincere thanks and deep appreciation must be extended to Dr.
N. Keith Womer, my thesis advisor, for his patience, support, and
assistance in this effort. Also, without the great understanding and
sacrifice of my wife, Ruth, this entire study would not have been possible.
Abstract

Previous studies into airframe acquisition cost estimation do not
explicitly recognize the existence of correlation in the historical data.
If one believes this data problem exists, then it is possible to develop
a components of variance model that takes the problem into account. It
is a more general model that recognizes two sources of error: (1) error
due to different types of airframes and (2) overall or ordinary
regression error. The variance of these two errors can be estimated and then
can be utilized along with the technique of generalized least squares to
obtain a cost estimating relationship which explicitly accounts for the
data correlation. This modeling technique, when compared to techniques
presently in service, shows that present estimating relationships
underestimate the variance of the cost prediction of a new type airframe and
overestimate the variance of the cost prediction of a follow-on airframe.
Also, those existing techniques which implicitly recognize data
correlation do not make use of all the data information available and therefore
produce estimates with poor confidence/prediction intervals. The
modeling technique developed here is an improvement over the present
techniques utilized and advances the state of the art of parametric
airframe cost estimation greatly.
I. Introduction

It is becoming more and more important in industry and especially
in the Department of Defense (DoD) to obtain good cost estimates of
newly conceptualized products and weapon systems. DoD has been under
ever increasing pressure from Congress and the American public to
justify the choices and costs of new systems.

One of the more costly items that DoD buys is aircraft. Aircraft
contracts run into billions of dollars and good cost estimation is
essential when attempting to obtain Congressional approval for an aircraft
program.

In the last ten years the parametric statistical approach to
developing cost estimating relationships (CER's) to predict costs has
received a great deal of emphasis and use. This is especially true in
the case of predicting aircraft airframe acquisition costs. This paper
will only address this particular area of parametric cost estimation,
that is, aircraft airframe acquisition costs. An airframe is the
body/frame of an aircraft without engines, avionics, wheels, etc. An
aircraft contract is the agreement with the company picked to produce the
aircraft. A "lot" of aircraft is what is delivered to DoD and this is
ordinarily one year's buy.
Estimating Costs

There are many approaches to parametric airframe cost estimation.
(Throughout the remainder of this study the term cost will refer to
acquisition cost only and airframe acquisition cost will be the only
item discussed.) The cost to be estimated could be the total cost of
an airframe contract (lot). Conversely, the costs to be addressed could
be the separate cost elements included in the total lot cost, i.e.,
tooling cost, material cost, labor cost, engineering cost, etc. The
elemental cost approach necessitates the development of a CER for each
cost element. Then to estimate the total lot cost, the cost of each
element is estimated and these are added together.

A recent study by RAND Corporation reports that neither the total
lot cost nor elemental lot cost approach results in superior accuracy
(Ref 13:3-6). Therefore, because the elemental cost approach is able
to point out to DoD the more expensive portions of cost, it is the
approach most often utilized. This report will also use the elemental
approach for the reasons given above and to facilitate comparison with
prior work in this area of cost estimation.


Problems with Parametric Estimation

The RAND Corporation has been the pacesetter in the area of
parametric airframe cost estimation. The different reports have approached
the CER development in several different ways and they have studied many
of the associated problems.

One problem is the choice of the functional form to be utilized for
the CER model. This form could be linear, exponential, or log-linear,
to name a few. Obviously, the specific choice of a functional form can
change the results greatly.

Another problem area is the choice of a statistical technique.
After the functional form is chosen one must use a specific technique
with its inherent assumptions to estimate the CER's.

The choice of which independent variables (explanatory variables)
to use is yet another area that causes problems. In airframe cost
estimation the number of possible variables is enormous (from airframe unit
weight to the number of wheels, etc.).

The final problem associated with parametric cost estimation to be
mentioned in this report is the treatment of the data to be utilized.
One can aggregate it, separate it, throw some out, make some up, and so
on. The list of possible treatments is endless.

This thesis will ignore all the problems mentioned above except the
last, data treatment. The functional form of all the CER's developed will
be log-linear and the only explanatory variables to be utilized will be
the natural logarithm of maximum aircraft speed at best altitude (S), unit
airframe weight (W) and airframe quantity (Q).

Data Problems

There are basically two ways of utilizing historical airframe costs
to develop cost estimating relationships. The first method is to treat
every airframe contract as an observation. Of course, no entire
inventory of a specific type aircraft is produced as one contract. Normally,
there are several lots of an aircraft successively produced by the same
contractor. For example, two recent RAND reports, one by Levenson, et al.
and the other by Timson and Tihansky, use this approach with 124
observations (lots). This encompasses only 26 different types of aircraft
(Ref 10; Ref 13).

The other approach to data utilization most often used is at the
opposite end of the spectrum from the first. One observation per
aircraft type is constructed (estimated) from the available contract data.
This data point estimation is most often accomplished by associating a
cost with each aircraft (airframe) type at some normalized and/or
specified airframe quantity. The cost of this given quantity may be
difficult, if not impossible, to obtain from the contractors, so most
often it is derived by estimating the learning curve associated with
each airframe type and then this curve is used to estimate the cost of,
for example, 100 airframes. Of course, a curve must be estimated for
each different aircraft type. RAND has studied the problem in this way
several times in the past and two representative reports are by
Levenson & Barro and Large, et al. (Ref 8; Ref 9).

There are problems associated with each approach as outlined above.
There could be a great deal of correlation between observations under
the lot cost approach. Womer suggests the possibility that lot costs
of the same type airframe are highly correlated to one another when
compared to lot costs of different or new airframes. Womer goes on to
illustrate this point by graphing the standardized residuals obtained
from a CER developed by Handel (Ref 11) against one of the explanatory
variables. The CER was based on 37 observations of contracts for
military fighter/trainer type airframes of 10 different types and Handel
stated that this relation has a coefficient of determination of .94.
Fig. 1 is a reproduction of this graph constructed by Womer and one can
see that it suggests that there is strong evidence of this type of
correlation (Ref 16:10-12).

Fig. 1.* Standardized Residuals Versus Weight (From Ref 16:12)
*Multiple observations at the same location are indicated by numbers.

The other data treatment would definitely be free of the
correlation problem previously mentioned, but notice that the number of
observations would be drastically reduced. For example, the Timson and
Tihansky study used 124 observations whereas the Large, et al. report
used only 26 (26 different type airframes). This could result in the
loss of potentially valuable historical cost information. Also, one
must remember that in the Large study the observations used were actually
estimated or fictitious costs.

One can see that each of the data approaches has its associated
problems. The Timson and Tihansky (TT) study is a very good
representative study of the first treatment and the report by Large, Campbell and

The hypotheses of this study are that there is correlation between
observations of different lots of the same type airframe and this
correlation will be assumed to be caused by the existence of two sources of
error associated with airframe costs. One source of error is due to
different types of airframes and the second is due to overall data error
(ordinary regression error).

A component of variance model is formulated to account for the two
sources of error. This model is shown to be more general than both the
linear models assumed by LAR and TT.

Implication of Hypotheses

If the hypotheses and assumptions given above are true, then present
cost estimating relationships which use TT's approach will underestimate
the variance of predicting the cost of a totally new type of aircraft
airframe. Conversely, the existing CER's will also overestimate the
variance for the prediction of the cost of a follow-on lot of a
particular existing airframe.

Secondly, the Large, et al. study, with its possible loss of
information, should result in confidence intervals that are larger than
necessary. That is, if more information is available when estimating
a CER, then the resulting confidence intervals should be tighter.

This study will investigate these hypotheses and thus determine
if either of the two existing procedures (LAR and TT) is appropriate
or whether a more general procedure is indicated.

Procedure

This thesis will develop a technique to estimate the components of
variance model. The technique, based on a method presented by Searle
(Ref 12:465-470), is referred to as the RANDOM technique.

If the variance of the error associated with different airframe
types is estimated to be zero or near zero, then the TT approach would
be appropriate (the correlation is not evident). If the estimated
variance of the overall error term is zero and the estimated variance of
the error between different airframes is not, then the Large approach is
most appropriate. But, if both components of variance are significant,
this indicates that neither the LAR nor TT approach is correct. By
implication the RANDOM technique would then be suggested or more
appropriate.

This is the task of this study; that is, to develop the RANDOM
technique to estimate the components of variance model; to apply the
technique together with the techniques of LAR and TT to the same set of
data; and to compare the resulting CER's with the LAR and TT type CER's
estimated.

Organization

This study will first cover the actual development of the RANDOM
model estimation technique and its extension to the procedure of
generalized least squares. Then, the actual data source and its meaning
will be discussed as it is applicable to all three modeling techniques
in Chapter III. This study will use a data base consisting of nine
fighter/trainer type airframes with a total of 33 lots.

Chapter IV will outline the actual development of all three
different sets of CER's. The cost element CER's to be estimated will be
that of total engineering hours, total tooling hours, recurring labor
hours, and recurring material dollars (1973). This chapter will then
present the derived CER's and analyze them.

The next chapter, Chapter V, will address the predictive qualities
of all three different types of CER's. All the CER's will predict the
respective costs associated with the F-14, lots one, two and at a
normalized quantity of 100 airframes. These predictions will then be
used for CER comparison. Lastly, in Chapter VI, this work will be
summarized and conclusions will be presented.
II. Components of Variance Model

This chapter will first develop the general technique utilized to
obtain the CER's for the mixed effects situation. Then it will go into
a discussion and short outline of the development of the technique used
to estimate the variances. This last portion of the chapter is the
crucial part of adapting the components of variance model to airframe
cost estimation.

Notation

Matrix notation will be utilized throughout this paper. Capital
letters will be used to designate vectors and matrices. The transpose,
inverse, and generalized inverse of a matrix will be denoted by $A'$,
$A^{-1}$, and $A^{-}$, respectively.

An estimator of a parameter, variable, vector or matrix will be
denoted by a hat over the symbol. For example, $\hat{A}$ would be an
estimate of $A$.

A vector whose elements have been replaced by the mean of its
elements will be indicated by a bar over the symbol, e.g., $\bar{A}$.

The trace of a matrix, which is the sum of the diagonal elements
of a matrix, will be denoted by tr, e.g., $\operatorname{tr} A$.

Ordinary Least Squares

The standard linear model is given by

$$Y = XB + \epsilon \tag{1}$$

where $Y$ is a vector of observations (dependent variable), $X$ is a matrix
of explanatory variables, and $\epsilon$ is the disturbance vector. The
assumptions made with this model are (Ref 14:110-111):

1. The expected value of the disturbance vector given the
explanatory variables is zero:

$$E(\epsilon \mid X) = 0 \tag{2}$$

2. The variance-covariance matrix of $\epsilon$, given $X$, is

$$\operatorname{Var}(\epsilon \mid X) = \sigma^2 I \tag{3}$$

where $\sigma^2$ is an unknown positive number and $I$ is an n x n
identity matrix.

3. The $X_i$ variables ($i = 1, 2, \ldots, k$) are linearly independent.

4. The n-element disturbance vector, $\epsilon$, is assumed to be
normally distributed.

The Ordinary Least Squares (OLS) procedure is used to obtain an
estimate of the unknown coefficient vector $B$ so that

$$\hat{Y} = X\hat{B} \tag{4}$$

Now one can define $\hat{\epsilon}$ as a vector of residuals,

$$\hat{\epsilon} = Y - \hat{Y} \tag{5}$$

Without outlining all the well known details, the OLS solution for
$\hat{B}$ is

$$\hat{B} = (X'X)^{-1} X'Y \tag{6}$$

The derived properties of OLS estimates are (Ref 14:111-115):

1. $\hat{B}$ is unbiased.

2. The covariance matrix of $\hat{B}$ is

$$\operatorname{Var}(\hat{B}) = \sigma^2 (X'X)^{-1} \tag{7}$$

3. An unbiased estimator for the unknown parameter, $\sigma^2$, is given
by

$$\hat{\sigma}^2 = \frac{\hat{\epsilon}'\hat{\epsilon}}{n - k} \tag{8}$$

where $k$ is the number of explanatory variables.
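The OLS quantities above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `ols` and the six data points are hypothetical and are not taken from the thesis data base. The residual variance is computed with the usual divisor n - k, where k is the number of columns of X.

```python
import numpy as np

def ols(X, y):
    """Ordinary least squares for Y = XB + e (illustrative sketch)."""
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    B_hat = XtX_inv @ X.T @ y                    # Eq (6): (X'X)^-1 X'Y
    resid = y - X @ B_hat                        # Eq (5): residual vector
    sigma2_hat = float(resid @ resid) / (n - k)  # residual variance, divisor n - k
    cov_B = sigma2_hat * XtX_inv                 # Eq (7) with sigma^2 estimated
    return B_hat, resid, sigma2_hat, cov_B

# Hypothetical data: a column of ones plus one regressor.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
X = np.column_stack([np.ones_like(x), x])
y = 1.0 + 2.0 * x + np.array([0.1, -0.1, 0.2, -0.2, 0.1, -0.1])

B_hat, resid, sigma2_hat, cov_B = ols(X, y)
print("coefficients:", B_hat)
print("residual variance:", sigma2_hat)
```

A quick sanity check on any such fit is that the residuals are orthogonal to every column of X, which follows directly from Eq (6).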


Generalized Least Squares

In the standard linear model, Eq (1), it is assumed that the
random disturbances are uncorrelated and identically distributed. This
results in a diagonal conditional covariance matrix with $\sigma^2$ on the
diagonal.

Now, if assumption two above is violated, the conditional
covariance matrix is no longer diagonal:

$$\operatorname{Var}(\epsilon \mid X) = \sigma^2 V \tag{9}$$

where $\sigma^2$ is an unknown positive parameter and $V$ is a known symmetric
positive definite n x n matrix whose trace equals n.

Because $V$ is a symmetric positive definite matrix, its inverse has
the same properties. Therefore, there exists a non-singular matrix $P$
such that

$$P'P = V^{-1} \tag{10}$$

To find an estimator of $B$ we transform the standard linear model by
multiplying through both sides by $P$:

$$PY = PXB + P\epsilon \tag{11}$$

The expected value of $P\epsilon$ is still equal to zero and the conditional
covariance matrix of $P\epsilon$ given $X$ is:

$$\begin{aligned}
\operatorname{Var}(P\epsilon \mid X) &= P\,[\operatorname{Var}(\epsilon \mid X)]\,P' \\
&= P(\sigma^2 V)P' \\
&= \sigma^2 P(P'P)^{-1}P' \\
&= \sigma^2 I
\end{aligned} \tag{12}$$

So, after the transformation, the new model does meet the assumptions
given under the standard linear model. Therefore, OLS can
be utilized to obtain an estimate of the coefficients.

Under GLS an estimate of $B$ is

$$\hat{B} = (X'V^{-1}X)^{-1} X'V^{-1}Y \tag{13}$$

with the conditional covariance matrix given by

$$\operatorname{Var}(\hat{B} \mid X) = \sigma^2 (X'V^{-1}X)^{-1} \tag{14}$$

and $\sigma^2$ is estimated by

$$\hat{\sigma}^2 = \frac{(Y - X\hat{B})'V^{-1}(Y - X\hat{B})}{n - k} \tag{15}$$

where $k$ is the number of explanatory variables used for regression.
One can obtain a much more detailed development of GLS estimators from
Theil (Ref 14:237-240).
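As a rough check on the argument above, the following sketch computes the GLS estimate two ways: directly from Eq (13), and by transforming the model with a matrix P and then applying OLS. The choice of P as the inverse of the Cholesky factor of V is an assumption made here for convenience (any P with P'P equal to the inverse of V would do), and the data, function names, and block structure of V are hypothetical.

```python
import numpy as np

def gls(X, y, V):
    """GLS estimate, its covariance, and sigma^2 estimate per Eqs (13)-(15)."""
    n, k = X.shape
    V_inv = np.linalg.inv(V)
    XtVi = X.T @ V_inv
    B_hat = np.linalg.solve(XtVi @ X, XtVi @ y)           # Eq (13)
    resid = y - X @ B_hat
    sigma2_hat = float(resid @ V_inv @ resid) / (n - k)   # Eq (15)
    cov_B = sigma2_hat * np.linalg.inv(XtVi @ X)          # Eq (14), sigma^2 estimated
    return B_hat, cov_B, sigma2_hat

def gls_by_transformation(X, y, V):
    """Same estimate via PY = PXB + Pe followed by OLS.

    With V = L L' (Cholesky), P = L^-1 satisfies P'P = V^-1.
    """
    L = np.linalg.cholesky(V)
    PX = np.linalg.solve(L, X)
    Py = np.linalg.solve(L, y)
    B_hat, *_ = np.linalg.lstsq(PX, Py, rcond=None)
    return B_hat

# Hypothetical example: 4 groups of 3 correlated observations each.
rng = np.random.default_rng(0)
group = np.repeat(np.arange(4), 3)
n = group.size
same_group = (group[:, None] == group[None, :]).astype(float)
V = 0.5 * same_group + 0.5 * np.eye(n)   # unit diagonal, so trace(V) = n
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 2.0]) + rng.normal(size=n)

B_direct, cov_B, sigma2_hat = gls(X, y, V)
B_transformed = gls_by_transformation(X, y, V)
print(B_direct, B_transformed)
```

The two estimates should agree to numerical precision, which is the content of Eqs (11) and (12).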

Extension of GLS to Airframe Cost Estimation

As stated in the introduction to this paper, if one assumes that
there are two sources of error in airframe cost estimation, then
somehow the CER's developed must take this into account. These two error
terms are assumed to be normally distributed and independent of one
another. This essentially causes two problems. (1) The conditional
covariance matrix is not diagonal and therefore GLS must be used to
obtain the CER's. (2) One must obtain an estimate of the two different
variances to obtain an estimate of $V$. Once an estimate of $V$ is obtained,
then it can be used to replace $V$ in Eq (13) to estimate the elements of $B$.

Now, let us assume that the components of variance model to be used
is

$$Y_{ij} = X_{ij} B + u_j + \epsilon_{ij} \tag{16}$$

where $u_j$ is the error term associated with different type airframes and
$\epsilon_{ij}$ is the overall error term.

It is appropriate here that we discuss the relevance of two error
terms in this new model. The error associated with different aircraft
airframes could be tied to several factors. The error could result from
totally new aircraft designs, new materials used, and new and more
complicated design features, etc. These aircraft differences would
definitely affect cost but they may not be totally captured by the
explanatory variables utilized in the CER's. The overall error term, on
the other hand, would most probably reflect the general economic
conditions present under different contracts, e.g., small changes in labor
costs caused by inflation or strikes, slight changes in design from one
lot to another, etc.

Now it will be assumed that the distributions of both $u_j$ and
$\epsilon_{ij}$ are normal, independent of one another, and have variances
equal to $\sigma_u^2$ and $\sigma_\epsilon^2$, respectively (Ref 16:13).
Now, let $g$ represent the overall error term for the model:

$$g_{ij} = u_j + \epsilon_{ij} \tag{17}$$

where $g_{ij}$ is distributed normally with a mean, $a$, and a variance equal
to $\sigma_u^2 + \sigma_\epsilon^2$:

$$g_{ij} \sim N(a,\ \sigma_u^2 + \sigma_\epsilon^2)$$

Then the covariance matrix of the $g_{ij}$ is equal to

$$V = \begin{bmatrix}
V_1 & 0 & \cdots & 0 \\
0 & V_2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & V_r
\end{bmatrix}$$

where $r$ equals the number of different airframes, zeroes are on the off
main diagonal, and $V_j$ is a symmetric matrix

$$V_j = \begin{bmatrix}
\sigma_u^2 + \sigma_\epsilon^2 & \sigma_u^2 & \cdots & \sigma_u^2 \\
\sigma_u^2 & \sigma_u^2 + \sigma_\epsilon^2 & \cdots & \sigma_u^2 \\
\vdots & \vdots & \ddots & \vdots \\
\sigma_u^2 & \sigma_u^2 & \cdots & \sigma_u^2 + \sigma_\epsilon^2
\end{bmatrix}$$

where $j = 1, 2, \ldots, r$ and the size of the matrix is m x m with
m equal to the number of lots produced of the jth aircraft.

The development of the value for the off-diagonal elements of $V_j$
follows easily from the basic assumptions concerning $u_j$ and $\epsilon_{ij}$. The
off-diagonal elements correspond to the covariance between $g_{ij}$ and
$g_{kj}$, i.e., the covariance between two different lots of the same type
aircraft.

$$\begin{aligned}
\operatorname{Cov}(g_{ij}, g_{kj})
&= E[(u_j - a + \epsilon_{ij})(u_j - a + \epsilon_{kj})] \\
&= E[(u_j - a)(u_j - a) + (u_j - a)\epsilon_{ij}
 + (u_j - a)\epsilon_{kj} + \epsilon_{ij}\epsilon_{kj}]
\end{aligned} \tag{18}$$

Since the assumptions cause $E[(u_j - a)\epsilon_{ij}]$, $E[(u_j - a)\epsilon_{kj}]$, and
$E[\epsilon_{ij}\epsilon_{kj}]$ to equal zero, the covariance $\operatorname{Cov}(g_{ij}, g_{kj})$ =
$E[(u_j - a)(u_j - a)]$, which by definition equals $\sigma_u^2$.

The general model therefore becomes

$$Y = XB + G \tag{19}$$

with $G$ distributed normally with a mean, $a$, and a variance equal to $V$:

$$G \sim N(a, V)$$

If one then has estimates for the two variance components, then it
would be a simple matter to build the $\hat{V}$ matrix and then estimate the
coefficients of the model (CER) using Eq (13) (Ref 16:13-14).
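To make the block structure of V concrete, here is a minimal sketch that assembles V from the number of lots per airframe type and the two variance components. The function name `build_V` and the numerical values are hypothetical; only the structure (the type variance everywhere within a block, the overall error variance added on the diagonal, zeroes between blocks) comes from the text.

```python
import numpy as np

def build_V(lots_per_type, sigma_u2, sigma_e2):
    """Block-diagonal covariance matrix of the combined error term.

    lots_per_type : list with the number of lots for each airframe type
    sigma_u2      : variance of the airframe-type error term
    sigma_e2      : variance of the overall (lot-level) error term
    """
    n = sum(lots_per_type)
    V = np.zeros((n, n))
    start = 0
    for m in lots_per_type:
        # V_j: sigma_u2 in every cell, plus sigma_e2 on the diagonal.
        V_j = sigma_u2 * np.ones((m, m)) + sigma_e2 * np.eye(m)
        V[start:start + m, start:start + m] = V_j
        start += m
    return V

# Hypothetical case: two airframe types with 2 and 3 lots.
V = build_V([2, 3], sigma_u2=0.5, sigma_e2=0.2)
print(V)
```

Setting `sigma_u2` to zero in this sketch gives a diagonal matrix, which is the uncorrelated case assumed when every lot is treated as an independent observation.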


Estimation of the Variance Components

The actual thrust of this thesis is built around the variance
components. Up to this point nothing has been said about the actual
method utilized to estimate these values. After an extensive literature
search (see Supplementary Bibliography, Ref 1, 2, 3, 4, 5, 7, 8), a
method reported by Searle was chosen (Ref 12:463-470). This method of
variance component estimation uses the fitting constants method (commonly
referred to as Henderson's Method 3) and an iterative technique developed
by Thompson involving the ratio $\sigma_\epsilon^2/\sigma_u^2$ (Ref 13:767-773). The estimated
components obtained from the technique are both unbiased and maximum
likelihood. This technique was the only one discovered that obtains
estimates with the above properties when the data is unbalanced.

Outline of Variance Component Estimation Technique

The following portion of this chapter is a very general development
of the variance component estimation technique. The general model is

$$Y = XB + ZU + \epsilon \tag{20}$$

where in our case, the vector $Y$ is the natural logarithm (ln) of cost
(dependent variable), $X$ is the ln of the matrix of explanatory variables,
which is of rank $r$, $Z$ is a matrix of indicator variables (ones and zeroes)
of full rank, $B$ is a vector of unknown fixed constants, $U$ represents the
effects of a single random factor (in our case different type aircraft)
having a variance of $\sigma_u^2$, and $\epsilon$ is the random disturbance term over all
aircraft and lots (Ref 12:465).

For airframe cost estimation the $X$ and $Z$ matrices will be constructed
as follows:

$$X = \begin{bmatrix}
\ln S_1 & \ln W_1 & \ln Q_1 \\
\ln S_2 & \ln W_2 & \ln Q_2 \\
\vdots & \vdots & \vdots \\
\ln S_n & \ln W_n & \ln Q_n
\end{bmatrix}$$
where the number of rows is equal to the number of separate airframe lot
observations. One will notice that a column of ones was not used. The
$Z$ matrix will consist of as many columns as there are different
airframes and in each column there will be ones, where the number of ones
will correspond to the total number of lots of that particular airframe
built. The ones will be in the rows corresponding to the rows of the
respective costs. The columns of $Z$ will sum to a column of ones (each
row contains exactly one 1), e.g.,

$$Z = \begin{bmatrix}
1 & 0 & 0 \\
1 & 0 & 0 \\
0 & 1 & 0 \\
0 & 1 & 0 \\
0 & 0 & 1 \\
0 & 0 & 1
\end{bmatrix}$$

Now, the estimators for $\sigma_\epsilon^2$ and $\sigma_u^2$ are

$$\hat{\sigma}_\epsilon^2 = \frac{Y'Y - [R(U) + R(B \mid U)]}{n - r(X)} \tag{21}$$

and

$$\hat{\sigma}_u^2 = \frac{R(U) + R(B \mid U) - R(B)}{\operatorname{tr}[Z'Z - Z'X(X'X)^{-}X'Z]} \tag{22}$$

where $n$ is the total number of observations and the R's denote the sum
of squares of reduction due to whatever symbol is contained in the
parentheses. These results provide an iterative procedure because the
reductions $R(U)$ and $R(B \mid U)$ involve $\lambda = \sigma_\epsilon^2/\sigma_u^2$. The estimation is done
by picking an initial value of $\lambda$ (zero is acceptable) and then
calculating Eqs (21) and (22); recalculating $\lambda$ and so on until convergence.
A detailed development of this procedure and the resulting equations for
the different reductions are contained in Appendix A (Ref 12:465-470).
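The reductions used in the iterative procedure are defined in Appendix A and are not reproduced here, so the sketch below does not implement that procedure. It swaps in the textbook non-iterative form of the fitting constants method (Henderson's Method 3), which equates the error sum of squares and the reduction due to Z after X to their expected values. Everything in it is an assumption for illustration: the function names, the simulated data, the true variances 0.5 and 0.2, and in particular the column of ones placed in X so that the type effects can be treated as having mean zero.

```python
import numpy as np

def reduction(A, y):
    """Reduction in sum of squares from fitting y on the columns of A."""
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    fitted = A @ coef
    return float(fitted @ fitted)

def fitting_constants(y, X, Z):
    """Non-iterative Henderson Method 3 estimates (sigma_u2, sigma_e2)."""
    n = y.size
    W = np.hstack([X, Z])
    r_x = np.linalg.matrix_rank(X)
    r_w = np.linalg.matrix_rank(W)
    R_full = reduction(W, y)      # reduction due to fitting X and Z together
    R_fixed = reduction(X, y)     # reduction due to fitting X alone
    sigma_e2 = (float(y @ y) - R_full) / (n - r_w)
    # tr[Z'Z - Z'X (X'X)^- X'Z], using the projection onto the columns of X
    PxZ = X @ (np.linalg.pinv(X) @ Z)
    trace_term = np.trace(Z.T @ Z) - np.trace(Z.T @ PxZ)
    sigma_u2 = (R_full - R_fixed - sigma_e2 * (r_w - r_x)) / trace_term
    return sigma_u2, sigma_e2

# Simulated, purely hypothetical data: 300 types with 5 lots each.
rng = np.random.default_rng(0)
r, m = 300, 5
group = np.repeat(np.arange(r), m)
n = group.size
Z = np.zeros((n, r))
Z[np.arange(n), group] = 1.0
X = np.column_stack([np.ones(n), rng.normal(size=n)])
u = rng.normal(scale=np.sqrt(0.5), size=r)     # type effects, variance 0.5
e = rng.normal(scale=np.sqrt(0.2), size=n)     # lot-level error, variance 0.2
y = X @ np.array([1.0, 2.0]) + Z @ u + e

sigma_u2_hat, sigma_e2_hat = fitting_constants(y, X, Z)
print(sigma_u2_hat, sigma_e2_hat)
```

With this many simulated groups the two estimates should land near the values used to generate the data. On a small unbalanced data base the estimate of the type variance can be imprecise or even negative, which is one motivation for iterative schemes such as the one adopted in this study.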
The combination of the components of variance model Eq (16), the
generalized least squares technique, and Searle's method of estimating
the variance components will be referred to as the RANDOM technique
throughout the remainder of this study.

Relationship of the RANDOM Technique to the Techniques
of Timson and Tihansky and Large, et al.

This thesis is comparing the results of three different CER
development techniques. But, actually the RANDOM technique alone could
indicate the better of the three methods. The variance components
estimated as outlined above should give a very good indication of which
method is more appropriate for a particular CER.

One can see from the construction of the $V_j$ matrices and the $V$
matrix (see pages 13-14) that if $\hat{\sigma}_u^2$ is equal to zero, the $V$ matrix
reduces to the diagonal form assumed in Timson and Tihansky's (TT's)
study. This would lead one to believe that the TT CER's are best.

On the other hand, if $\hat{\sigma}_\epsilon^2$ is equal to zero, then the $V_j$ matrices
would be constructed with the value, $\hat{\sigma}_u^2$, only. This, of course, would
make all the $V_j$'s singular and this in turn would lead to a singular $V$
matrix. The solution to this singularity problem would then be to limit
each $V_j$ matrix to a one by one matrix with a value of $\hat{\sigma}_u^2$, which reduces
the $V$ matrix to a square matrix with $\hat{\sigma}_u^2$ on the diagonal and zeroes
elsewhere. The row/column size of this matrix would be equal to the
number of different type airframes. In other words, with $\hat{\sigma}_\epsilon^2 = 0$ one
is now faced with the Large, et al. (LAR) approach.

that neither the LAR nor TT technique is best. In this case one could
argue, then, that somewhere in between LAR and TT is the better
technique, RANDOM.


A Possible Test of Hypothesis

One could carry the theory put forth in the previous section one
step further. If a valid test statistic were available, one could use
a statistical test of hypothesis to determine if one or the other of
the components is equal to zero with some specified probability. For
example, the null hypothesis could be

$$H_0: \sigma_u^2 = 0$$

or

$$H_0: \sigma_\epsilon^2 = 0$$

against the respective alternate hypothesis

$$H_a: \sigma_u^2 \neq 0$$

or

$$H_a: \sigma_\epsilon^2 \neq 0$$

Unfortunately, at present, there is no test statistic derived which
is directly applicable to the variance components to be estimated in
this study.

One could make an assertion here, though, that if the estimated
components are of near equal magnitude, there is no strong evidence that
either component is equal to zero and the other is not.
III. The CER's to be Developed and Data

Four airframe cost elements will be addressed in this paper. These
are engineering hours (total), tooling hours (total), labor hours
(recurring), and material dollars (recurring). There are two reasons
these four elements were chosen over any others. First, the data for
these cost elements is much more readily available and more consistent
than the data for the other elements. Secondly, the CER's estimated for
these four cost elements will be adequate to determine the usefulness
and applicability of the RANDOM technique as compared to the other two.
If the RANDOM technique proves to be good for this set of CER's, then
this will be sufficient cause for investigation of the technique into
all areas of airframe cost estimation.

The first portion of this chapter will discuss the functional form
of the CER's and the independent (explanatory) variables to be utilized
in this study. The last sections of this chapter will focus on the
source and reduction of the actual data to be utilized.

Physical Characteristics (Explanatory Variables)
Related to Cost

The selection and acquisition of military aircraft begins with
little more than a requirement for a certain performance and/or a specific
set of physical attributes (maximum speed, weight, etc.). Prior to
1975 it was generally thought that, of the characteristics available
in the early conceptual stages of aircraft development, the only ones
with any statistically proven parametric significance were airframe unit
weight, maximum speed at best altitude, production rate and production
quantity. (Airframe unit weight is defined as empty weight minus the
following: wheels, brakes, tires, and tubes; engines (main and
auxiliary); rubber or nylon fuel cells; starters (main and auxiliary);
propellers; auxiliary power-plant unit; instruments; batteries and
electrical power supply and conversion; avionics group; turrets and
power-operated mounts; air conditioning, anti-icing and pressurization
units and fluids; cameras and optical view finders; trapped fuel and
oil.) (Ref 8:19). Therefore, the CER's developed most often use
different combinations and subsets of these four explanatory variables.

Functional  CER  Forms 

The functional forms for the cost estimating equations themselves that conform best to airframe cost data and any least squares technique are the log-linear and exponential forms:

(Log-linear)    $Y = e^{a} S^{b} W^{c} e^{\varepsilon}$    (23)

(Exponential)    $Y = e^{a} S^{b} W^{c} + \varepsilon$    (24)

where Y is the dependent variable (cost to be estimated), W denotes unit weight, S denotes maximum aircraft speed at best altitude, e is the base of the natural logarithms, $\varepsilon$ is the normally distributed error term, and a, b, and c are the constants to be estimated. The logarithmic form can be "linearized" for the application of the least squares method by taking the natural logarithm of both sides of the equation:

(Log-linear)    $\ln Y = a + b \ln S + c \ln W + \varepsilon$    (25)

The only difference between the two forms is the manner in which the error term $\varepsilon$ is treated (multiplied or added).
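As an illustration of how the linearized form lends itself to least squares, the following is a minimal sketch in Python. It is not the program used in this study; the function names and the observations are hypothetical, and the data are generated without error so that the fitted constants can be checked against the known values a = 1.0, b = 0.8, c = 0.6.

```python
import math


def solve_linear_system(matrix, rhs):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - tail) / aug[row][row]
    return solution


def fit_log_linear_cer(speeds, weights, costs):
    """Least squares fit of ln Y = a + b ln S + c ln W; returns [a, b, c]."""
    rows = [[1.0, math.log(s), math.log(w)] for s, w in zip(speeds, weights)]
    targets = [math.log(y) for y in costs]
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(3)]
    return solve_linear_system(xtx, xty)


# Hypothetical, error-free observations generated from Y = e^1.0 * S^0.8 * W^0.6
speeds = [500.0, 800.0, 1100.0, 1300.0, 650.0]
weights = [9000.0, 12000.0, 20000.0, 15000.0, 30000.0]
costs = [math.exp(1.0) * s ** 0.8 * w ** 0.6 for s, w in zip(speeds, weights)]
a_hat, b_hat, c_hat = fit_log_linear_cer(speeds, weights, costs)
```

Solving the normal equations directly is adequate for three coefficients; a larger problem would normally use a numerically safer factorization.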
Of the two forms listed in the preceding paragraph, the log-linear is most often used (Ref 8; Ref 13). This is mainly due to the fact that this form conforms better to the real world. The error distribution is skewed so that at a given confidence level (a statistically determined interval within which a CER estimate of cost is said to lie with a certain probability) the lower bound of the interval lies closer to the estimate than the upper bound does, which is analogous to the chances being greater for a cost overrun on a contract than an underrun (for the linear or exponential models, the maximum underrun and overrun are of equal magnitude). Also, the range of the logarithmic CER is from zero to positive infinity (no negative cost), whereas the exponential CER ranges from minus infinity to plus infinity. There are other lesser advantages to the log-linear form and they are explained in the report by Timson and Tihansky (TT) (Ref 13:4-13).
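The skewness described above can be seen in a short sketch. The numbers are hypothetical, and the interval is formed simply as the log-space estimate plus or minus z standard deviations, which is an assumption made for illustration rather than the interval construction used by TT.

```python
import math


def log_linear_interval(ln_estimate, sigma, z=1.96):
    """Return (lower, point, upper) in cost units for an estimate made in log space."""
    point = math.exp(ln_estimate)
    lower = point * math.exp(-z * sigma)
    upper = point * math.exp(z * sigma)
    return lower, point, upper


# Hypothetical estimate of 1000 hours with a log-space standard deviation of 0.25
lower, point, upper = log_linear_interval(math.log(1000.0), 0.25)
underrun = point - lower  # distance from the estimate down to the lower bound
overrun = upper - point   # distance from the estimate up to the upper bound
```

The interval is symmetric in logarithms but not in cost units, so the possible overrun exceeds the possible underrun and the lower bound never falls below zero.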

This study will use the same three variables for all CER's estimated:

1.  Maximum  speed  at  best  altitude  (S) 

2.  Unit  airframe  weight  (W) 

3.  Quantity  of  airframes  produced  (Q). 

The Large, et al. (LAR) CER's do not, of course, use Q explicitly. Therefore, a version of the LAR approach, which does use Q, is developed for comparison. These variables are representative of the variables used for most previously developed CER's in this area of study. Since the same variables will be used for all three types of CER's, the comparison of the approaches should be valid.

Data Source

The data base used by this report was compiled by the RAND Corporation in conjunction with the LAR study done for the Assistant Secretary of Defense (Ref 8). The data is contained on work sheets prepared by RAND and these are kept on file at the Cost Library, Aeronautical Systems Division, Wright-Patterson Air Force Base, Ohio.

Data Reduction

The data actually utilized to develop all the CER's is a small subset of the total base available from RAND. Nine airframe types are included and the total number of lots is equal to 33. The aircraft included in this subset are fighter and trainer types only. These aircraft and the respective number of lots are:

A-4D (3)          F-105B,D (4)      F-4A,B (4)
F-102A (4)        F-106A,B (5)      T-38A (3)
F-104A,B,C (3)    A-5A,C (4)        F-111A,C,E (3)

The cost data for these airframes is presented in Appendix C. The number of airframes and the cost of each lot is adjusted for major design changes. Also, where the number of airframes in actual contracts is very small and no major design changes were made, these small contracts were combined with preceding or succeeding contracts to provide a more consistent observation set for use in the CER development. The LAR and TT studies both proceeded in this same manner.

Handel, in a thesis on airframe cost estimation, utilized the same procedures as above and his data set included all nine of the airframes to be used in this thesis. After careful examination of his data and referencing the RAND work sheets, it was determined that his data would be adequate for this study (Ref 4:26-34).

The data for the four cost elements to be studied can be broken down into recurring and non-recurring dollars/hours. Previous studies by the RAND Corporation have indicated that, where possible, recurring cost data should be utilized for CER development. But in these same studies discrepancies were discovered in the contractor reported data for tooling and engineering costs. As a result, only total tooling and engineering costs/hours were estimated in this study (Ref 8:16). The labor and material hours/costs are of the recurring type only.

Cost Element Definitions

The tooling costs consist of the material, labor, and overhead costs for the assembly tools, dies, jigs, fixtures, work stands, and test equipment needed for the production of an airframe. The number of direct labor hours used for tooling is highly correlated with the tooling cost and therefore CER's can be developed using tooling hours. The tooling cost, then, can be obtained by multiplying the number of direct tooling hours by a composite hourly rate, taking into account all those cost items mentioned earlier (Ref 4:26-27).

The total engineering costs consist of the preliminary design effort and integration and of the material, labor, and overhead costs expended in the engineering for the basic airframe and of the system engineering performed by the prime contractor. Total engineering hours can be utilized in the exact same way as tooling hours, with the total engineering cost estimated by applying a composite rate to the number of engineering labor hours expended (Ref 4:27-28).




G0R/SM/76D-10 


The recurring cost of manufacturing material includes the costs of raw and semifabricated materials, purchased parts, and purchased equipment. The dollar cost of the manufacturing material was adjusted to 1973 dollars with the use of the price adjustment indices listed in Table I, so any material recurring costs estimated by CER's in this study are the constant year dollar sum of material, purchased equipment, and government furnished equipment (Ref 4:32).
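The adjustment to constant dollars can be sketched as follows. The four material index values are taken from Table I; the function name and the dollar amount are hypothetical, and it is assumed here that each index is a multiplier applied to then-year dollars (the indices exceed 1.000 for every year before 1973, which is consistent with that reading).

```python
# Material price adjustment indices from Table I (subset), base year 1973
MATERIAL_INDEX_TO_1973 = {1960: 1.718, 1965: 1.479, 1970: 1.177, 1973: 1.000}


def to_1973_dollars(then_year_dollars, year, index=MATERIAL_INDEX_TO_1973):
    """Convert a then-year dollar amount to 1973 dollars (assumed multiplicative index)."""
    return then_year_dollars * index[year]


# Hypothetical lot: 2,000,000 dollars of material purchased in 1965
adjusted = to_1973_dollars(2_000_000.0, 1965)
```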

Lastly, manufacturing recurring labor hours consist of the direct labor required to machine, process, assemble, and fabricate the major structure of an airframe (Ref 4:29).

The  CER's 

In this thesis there will actually be six different CER's estimated for each cost element. These CER's differ with respect to the technique used (RANDOM, LAR, or TT) and/or the way in which the data is utilized (data/cost approach).

Cumulative Cost Approach. Cumulative airframe cost and cumulative airframe quantity lot data will be used with both the RANDOM and TT techniques to develop a cumulative RANDOM CER and a cumulative TT CER. The speed and weight used will be that of the last airframe produced in each lot.

Marginal Cost Approach. The second cost approach will again be utilized by both the RANDOM and TT techniques. This approach will use average cost per airframe per lot. To obtain this average cost per lot the actual lot cost is divided by the number of airframes produced in the lot. This can be derived from the cumulative cost data presented in Appendix C. This cost will be termed Marginal Cost in this thesis.
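The derivation of the average cost per airframe per lot from cumulative data can be sketched as below. The function name and the three lots are hypothetical and are not taken from Appendix C.

```python
def average_lot_costs(cumulative_quantities, cumulative_costs):
    """Average cost per airframe for each lot, from cumulative quantity/cost pairs."""
    averages = []
    previous_quantity, previous_cost = 0, 0.0
    for quantity, cost in zip(cumulative_quantities, cumulative_costs):
        lot_size = quantity - previous_quantity
        if lot_size <= 0:
            raise ValueError("cumulative quantities must be strictly increasing")
        # Lot cost is the increase in cumulative cost over the preceding lot.
        averages.append((cost - previous_cost) / lot_size)
        previous_quantity, previous_cost = quantity, cost
    return averages


# Hypothetical program: lots of 10, 20, and 40 airframes
marginal_costs = average_lot_costs([10, 30, 70], [500.0, 1100.0, 1900.0])
```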
Table I

Price Adjustment Indices
1973

Year    Material    Equipment
1952    2.625       2.972
1953    2.480       2.806
1954    2.359       2.636
1955    2.224       2.506
1956    2.081       2.353
1957    1.970       2.226
1958    1.859       1.078
1959    1.793       1.981
1960    1.718       1.892
1961    1.672       1.833
1962    1.614       1.756
1963    1.579       1.696
1964    1.528       1.632
1965    1.479       1.568
1966    1.422       1.496
1967    1.359       1.422
1968    1.295       1.343
1969    1.208       1.249
1970    1.177       1.188
1971    1.137       1.138
1972    1.094       1.081
1973    1.000       1.000
This, of course, results in two different CER's also: the Marginal RANDOM and Marginal TT CER's.

There is a good reason for deriving these marginal CER's. This study has acknowledged the fact that there may be correlation in the data from the lots of the same type airframe. The RANDOM technique will hopefully account for this fact. But there may be another type of correlation taking place when one uses the cumulative cost approach. Auto-correlation could result from the fact that each successive lot cost (cumulative) includes the cost of the preceding lot(s). If present, this auto-correlation affects the statistics associated with the cumulative CER's. One common result of this type of data correlation is a larger than actual coefficient of determination, $R^2$, thus indicating that the CER fits the data better than it actually does. The marginal cost approach eliminates the possibility of this auto-correlation.
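A simple worked case shows why cumulative observations are correlated. Assume, for illustration only, that each lot cost carries an independent additive error of equal variance. The cumulative cost after lot k then contains the errors of lots 1 through k, so the covariance between the cumulative totals after lots j and k is proportional to min(j, k), and their correlation is the square root of min(j, k) / max(j, k). The function name is hypothetical.

```python
import math


def cumulative_error_correlation(j, k):
    """Correlation between cumulative totals after lots j and k (1-based),
    assuming independent, equal-variance, additive lot errors."""
    if j < 1 or k < 1:
        raise ValueError("lot numbers start at 1")
    return math.sqrt(min(j, k) / max(j, k))


# Adjacent cumulative observations late in a program are highly correlated
late_pair = cumulative_error_correlation(4, 5)
early_late_pair = cumulative_error_correlation(1, 4)
```

Under this assumption the correlation is never zero, whereas the lot-by-lot (marginal) errors are uncorrelated by construction.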

The transformation of the RAND explanatory variable data to marginal data is quite involved, though. The variables of weight and speed will refer to the same values as mentioned in the cumulative cost approach definition, but the airframe quantity explanatory variable will refer to a quite different value. In this marginal cost approach Q will symbolize the true lot midpoint of each lot. Again, this approach will only be applied with the RANDOM and Timson and Tihansky techniques.

To discuss the meaning and derivation of true lot midpoints one must discuss the existence of a production learning curve. In the aerospace industry, for example, there is empirical evidence that there exists a learning process or phenomenon that causes a reduction in production cost as the number of items produced increases. Although there are several hypotheses on the exact manner in which this reduction takes place, the basis of learning-curve theory is that each time the total quantity of items produced doubles, the cost per item is reduced to some constant percentage of the previous cost (Ref 1:1).

Most learning-curve slope derivations deal with cumulative average cost:

$y_c = a x^{b}$    (26)

where a is the cost of the first item produced, x is the cumulative number of items produced, b is an exponent that measures slope, and $y_c$ is the average cost of all items produced up to and including x. The learning-curve slope, s, describes the average cost of 2x items as a fraction of the average cost of x items and is related to b as follows:

$s = 2^{b}$  or  $b = \frac{\ln s}{\ln 2}$    (27)

Now, in our case we are dealing with total cumulative cost and this changes Eq (26):

$t = y_c x$ = total cost of x items    (28)

and therefore

$t = a x^{b} x = a x^{b+1}$    (29)

The above changes Eq (27) as follows, where b now denotes the exponent of x in the total cost relationship:

$b = \frac{\ln s}{\ln 2} + 1$    (30)

or

$b - 1 = \frac{\ln s}{\ln 2}$    (31)


With the above results then one can estimate the unit learning-curve slope given some value for b by using the equation

$s = e^{(b-1)\ln 2}$    (32)

This is the equation used by this study to estimate the slope of the unit learning curve for all four cost elements. The value used for b in this report is the exponent of the airframe quantity explanatory variable as estimated by the RANDOM technique under the cumulative cost approach.
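The relationship between the quantity exponent and the slope can be sketched in a few lines. The function names and the 80 percent example are hypothetical; b is taken to be the exponent on cumulative quantity in a total cost relationship, as in Eq (32).

```python
import math


def slope_from_total_cost_exponent(b_total):
    """Eq (32): s = exp((b - 1) ln 2), with b the exponent on cumulative quantity."""
    return math.exp((b_total - 1.0) * math.log(2.0))


def total_cost_exponent_from_slope(slope):
    """Eq (30): b = ln s / ln 2 + 1."""
    return math.log(slope) / math.log(2.0) + 1.0


# Hypothetical 80 percent curve: recover the exponent, then the slope again
b_example = total_cost_exponent_from_slope(0.80)
s_example = slope_from_total_cost_exponent(b_example)
```

An exponent of exactly one corresponds to a slope of one, that is, no learning.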

Once an estimate of the unit learning-curve slope is obtained using Eq (32), then the learning-curve tables derived by Boren and Campbell of the RAND Corporation can be utilized to obtain the true lot midpoints. The procedure and theory are given in much greater detail in the publication titled Military Equipment Cost Analysis, and also in the Boren and Campbell volumes of learning-curve tables (Ref 11:93-123; Ref 1: Vol. 1, 1-13).

One can see from the above explanation that this report's estimate of s involves an estimate of b. To be sure, there is a great possibility of significant error. But because we are dealing with relatively small lot quantities, even a large estimation error results in true lot midpoint errors of only one or two airframes.
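The insensitivity of the midpoint to the slope can be illustrated by direct summation rather than by table lookup. The sketch below is not the Boren and Campbell procedure; it assumes a unit learning curve in which the cost of unit x is proportional to x raised to the power ln s / ln 2, and defines the true lot midpoint as the quantity whose unit cost equals the average unit cost of the lot. The function name and the lot are hypothetical.

```python
import math


def true_lot_midpoint(first_unit, last_unit, slope):
    """Quantity Q whose unit cost equals the average unit cost of the lot."""
    if first_unit < 1 or last_unit < first_unit:
        raise ValueError("need 1 <= first_unit <= last_unit")
    b = math.log(slope) / math.log(2.0)  # unit-curve exponent, negative when slope < 1
    if b == 0.0:
        # No learning: every unit costs the same, so use the arithmetic midpoint.
        return (first_unit + last_unit) / 2.0
    lot_size = last_unit - first_unit + 1
    mean_factor = sum(x ** b for x in range(first_unit, last_unit + 1)) / lot_size
    return mean_factor ** (1.0 / b)


# Hypothetical first lot of 10 airframes under three different slopes
midpoint_70 = true_lot_midpoint(1, 10, 0.70)
midpoint_80 = true_lot_midpoint(1, 10, 0.80)
midpoint_90 = true_lot_midpoint(1, 10, 0.90)
```

For this lot, moving the slope from 70 to 90 percent shifts the midpoint by less than one airframe.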

Technique/Cost Approach. The fifth and sixth CER's to be presented can be described as being derived with either different techniques and/or different cost approaches. The fifth CER derivation will be by the use of the LAR technique as presented in their report. This technique can best be described by the way in which data is utilized (cost approach). The cost will be the cumulative cost (estimated) of a quantity of 100 airframes. This will be called the LAR CER.

The historical airframe cost data very seldom has an actual lot cost observation at cumulative quantity 100, so the following procedure was utilized in the LAR and Handel studies to obtain the observations. They assumed a log-linear relationship for learning. Then actual cumulative costs versus the corresponding cumulative quantities were plotted on logarithmic graph paper for each airframe type. A straight line was then drawn through the plot points immediately preceding and succeeding quantity 100. The cost observation for this quantity was then read off the graph (Ref 8:45). This method implies a cumulative average learning curve as compared to the unit learning curve assumed in the marginal approach explained earlier. Even with these two different assumptions, the LAR CER's and the two marginal CER's may be compared because the differences would be very slight in the lot sizes used in this study.
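Reading a value off a straight line on logarithmic graph paper is equivalent to linear interpolation in the logarithms. A minimal sketch of that arithmetic follows; the function name and the two bracketing observations are hypothetical.

```python
import math


def interpolate_cumulative_cost(target_quantity, point_before, point_after):
    """Straight-line interpolation on log-log axes between two (quantity, cost) points."""
    (q1, c1), (q2, c2) = point_before, point_after
    slope = (math.log(c2) - math.log(c1)) / (math.log(q2) - math.log(q1))
    log_cost = math.log(c1) + slope * (math.log(target_quantity) - math.log(q1))
    return math.exp(log_cost)


# Hypothetical cumulative observations at quantities 60 and 150
cost_at_100 = interpolate_cumulative_cost(100, (60, 4.2e6), (150, 8.1e6))
```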

The last CER to be developed in this study will be a cumulative total cost CER. Now the technique/cost approach used to derive this relationship is very similar to the technique/cost approach described for the LAR CER. The difference is that total airframe program cost will be used as the dependent variable and the total airframe quantity will be explicitly utilized as an explanatory variable, e.g., the total cost and quantity of all the F-4 airframes produced. This CER is developed to compare the learning curve approach of LAR's technique to a very similar technique, i.e., the same number of observations, but no estimation of a learning curve to obtain some normalized airframe quantity data for regression.

Summary. The preceding sections define the six CER's to be developed and analyzed in this study. Again, they are the:

1. Cumulative RANDOM CER

2. Cumulative TT CER

3. Marginal RANDOM CER

4. Marginal TT CER

5. LAR CER

6. Cumulative total cost CER

One must remember that the LAR and cumulative total cost CER techniques actually involve different cost approaches (they both utilize the ordinary least squares statistical technique). The TT and RANDOM techniques utilize airframe lot observations and therefore cannot be utilized with the LAR or cumulative total cost data approaches. Conversely, the LAR and cumulative total cost techniques cannot use the cumulative or marginal lot data.

Tables II and III list all the airframe types and the associated data to be used to estimate all six CER's.
Table II (continued)

*The Q variable has different values associated with each cost element (the learning curve slope changes). The values given above are for engineering only. See Table III for other cost element Q values.
IV. CER Development, Presentation, and Analysis


To reiterate, this thesis focuses on the RANDOM technique of developing airframe CER's. To analyze the results of this development there must be some frames of reference. The frames utilized in this study are two past airframe cost estimating reports, the one by Timson and Tihansky (TT) and the other by Large, Campbell, and Cates (LAR). To assure the proper comparisons all three techniques were used to develop the exact same CER's using the exact same data. The comparisons, then, that will follow in this chapter will be as equitable and valid as is possible.


CER  Values  to  be  Presented 

The actual presentation of the CER's will include all the coefficients estimated and several associated statistics. The statistics will be the coefficient of determination, $R^2$, for each CER, the Student t-ratio for each coefficient, the estimated residual variance, $\hat{\sigma}_\varepsilon^2$, for the TT and LAR CER's, and finally both $\hat{\sigma}_\varepsilon^2$ and $\hat{\sigma}_u^2$ for the RANDOM type CER's (the variance components).

The coefficient of determination is defined as:

$R^2 = 1 - \frac{(Y - \hat{Y})'(Y - \hat{Y})}{(Y - \bar{Y})'(Y - \bar{Y})}$    (33)

This particular $R^2$ statistic is uncorrected for degrees of freedom but it will serve a useful purpose for CER comparisons. This will tend to yield relatively higher $R^2$ values for the LAR CER's since they are characterized by few degrees of freedom.
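Eq (33) can be computed directly from the observed and fitted values. The sketch below is illustrative only; the function name and the four observations are hypothetical.

```python
def coefficient_of_determination(observed, predicted):
    """Uncorrected R^2 as in Eq (33): 1 - residual SS / total SS about the mean."""
    mean_observed = sum(observed) / len(observed)
    residual_ss = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    total_ss = sum((y - mean_observed) ** 2 for y in observed)
    return 1.0 - residual_ss / total_ss


# Hypothetical observed values (in logarithms) and fitted values
observed = [1.0, 2.0, 3.0, 4.0]
fitted = [1.1, 1.9, 3.2, 3.8]
r_squared = coefficient_of_determination(observed, fitted)
```

Because no correction is made for degrees of freedom, a CER fitted to only a few observations can show a high value without fitting well.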

The Student t-ratio is used to test the hypothesis that a coefficient equals zero or otherwise. The t-ratios will be placed in parentheses below the respective coefficients in the tabular presentation of the CER's. The t-ratio presented here is a standard statistic associated with each estimated CER coefficient. It is calculated by dividing the sum of squares due to regression (for a particular coefficient) by the residual standard deviation, $\hat{\sigma}_R$. It is the test statistic for determining the significance level of each estimated coefficient. Basically, the larger the statistic, the greater the significance.

The variance components, $\hat{\sigma}_\varepsilon^2$ and $\hat{\sigma}_u^2$, were estimated using the Searle technique and all the estimations were obtained after 10 iterations. An example computer program is provided in Appendix B. The determination of 10 iterations was made after an extensive number of computer runs. In all cases except the Engineering Marginal cost approach the variance component values converged rapidly and after five iterations the values changed beyond the fifth decimal place only. The number 10 was chosen for even more accuracy (Ref 2:23-23).

The symbols, variables, and/or acronyms used for the tabular presentation of the CER's (excluding some which are already defined) are presented in Table IV and explained in the succeeding paragraph.

The variables and/or acronyms defined in Table IV will be used as follows. The name of a particular CER will be denoted by a combination of symbols, always beginning with the symbol for the applicable cost element, then followed by the symbol describing the cost approach,
Table IV

Definitions

Acronym and/or Variable    Definition

Total Engineering Hours

Total Tooling Hours

Recurring Manufacturing Labor Hours

Recurring Material Dollar Cost (1973 dollars)

RANDOM Technique

Timson and Tihansky Technique

Large, Campbell, and Cates Technique

Cumulative Cost Approach
then finally followed by the symbol denoting the technique used to derive the particular CER, e.g., LCR will denote the total Labor hours CER using the Cumulative cost approach and RANDOM technique for its derivation. Two other examples would be MMT, which means Material, Marginal, Timson and Tihansky, and LCUMT, which denotes Labor hours using the CUMulative Total program cost approach/technique.


Total Engineering Hours

Thirty-three observations of nine fighter type aircraft were used to derive these CER's. The estimation of the RANDOM technique variance components utilized 10 iterations. The resulting CER's are presented in Table V.

The t-ratio statistics for all the engineering hour CER's, except the cumulative total CER, indicate that of the four explanatory variables used, weight and quantity are by far the most significant.

The cumulative RANDOM CER variance component, $\hat{\sigma}_u^2$, is much larger than $\hat{\sigma}_\varepsilon^2$. This indicates that under the cumulative cost approach the LAR technique is appropriate. This is consistent with the assumption that there is between lot correlation due to both random effects and auto-correlation. The LAR technique implicitly takes these factors into account and therefore may be the technique best suited to engineering cost estimation.

All the cumulative CER's show inflated $R^2$ values, which further indicates that auto-correlation is present when one uses the cumulative cost approach.

The $R^2$ associated with the cumulative total CER is larger than the LAR $R^2$. This is understandable since the cumulative total approach does not estimate observations as the LAR approach does. This cumulative total technique also accounts for the correlation problems in the same manner as the LAR technique. These findings, then, indicate that the cumulative total CER is even more appropriate than the LAR CER for an engineering hour cumulative CER.

The marginal RANDOM CER variance components indicate the reverse of the respective cumulative findings. That is, $\hat{\sigma}_\varepsilon^2$ is much larger than $\hat{\sigma}_u^2$. This indicates that under the marginal approach for engineering, the TT technique is most appropriate. This finding is not consistent with the hypotheses put forth in this study. This would indicate that there is no data correlation problem.

The $R^2$ values for the marginal CER's are more believable (smaller than the cumulative $R^2$'s). This is a result of the fact that auto-correlation is no longer present here.

This set of CER's did show significant inconsistencies as compared with the others. The iterative routine for estimating the variance components for the marginal RANDOM CER did not converge well. This was the only CER development that displayed this characteristic. This could account for the marginal result indicating the TT technique is appropriate when in fact it should not be. The data is the cause of these problems. One can see that the non-recurring costs are heavily loaded into the first lot of all the airframe types, much more so than in the other cost categories. Recall, the engineering costs are total costs because data inconsistencies prevented the separation of recurring and non-recurring elements (see page 26).

Because of the problems with the engineering marginal data mentioned above and correlation, the most appropriate engineering CER indicated here is the cumulative total CER.

Total Tooling Hours

Again, here, 33 observations were utilized for the development of all the tooling hour CER's excluding the LAR and cumulative total types, for which nine observations were used. Ten iterations were used to estimate the variance components. Table VI presents the results.

The most significant coefficients for these CER's are the constant coefficient, airframe unit weight, and quantity (where applicable).

The estimated variance components for the cumulative RANDOM CER again show that the greatest source of error is due to different types of airframes in the observation set, i.e., $\hat{\sigma}_u^2$ is much larger than $\hat{\sigma}_\varepsilon^2$. As stated for the engineering CER's, this indicates that the LAR CER is more appropriate when one uses cumulative data.

The cumulative $R^2$'s again show inflation due to auto-correlation. The $R^2$ for the cumulative total cost CER is greater than the LAR CER $R^2$, where the observations are estimated. Therefore, under the cumulative data approach the cumulative total cost CER is better than the RANDOM, LAR, and TT CER's.

The RANDOM variance components estimated under the marginal data approach are very nearly equal in magnitude. This is consistent with previous assumptions made. The marginal approach accounts for auto-correlation and the RANDOM technique uses the most information (data) while explicitly accounting for the between-lot random effects. These factors indicate that neither the LAR nor TT technique is appropriate, as hypothesized. The more general RANDOM technique is best under the tooling cost category.

The marginal $R^2$ values are again more believable than the respective cumulative $R^2$'s. Also, the RANDOM $R^2$ is the largest, which further defends the marginal RANDOM approach.

In light of the above results for the tooling cost category, the marginal RANDOM CER is recommended.

Recurring Labor Hours

The same type airframe data is used for the labor CER's as was used for the previous two sets. The results are shown in Table VII. Ten iterations were used for the estimation of the variance components.

The t-ratio statistics point to the same coefficients as most significant here as in the tooling CER's.

The cumulative RANDOM CER statistics indicate again that the LAR technique is most appropriate for cumulative data. The value $\hat{\sigma}_u^2$ is greater than $\hat{\sigma}_\varepsilon^2$ by more than a factor of ten. The cumulative $R^2$ values continue to show inflation due to auto-correlation.

As noted in the previous two cost category presentations, the cumulative total CER shows a better fit than the LAR CER, which leads to the same conclusion as before. That is, if one uses cumulative data, then the cumulative total cost technique is the most appropriate.

The RANDOM CER estimated with the marginal data does not indicate the superiority of either the LAR or TT techniques, since the components of variance are nearly equal.

Thus, for the labor cost element the marginal RANDOM technique provides the best CER estimate considering the data utilized in this study.

Recurring  Material  Dollars 

The results in this section very closely parallel the results for the tooling and labor CER's presented in the previous two sections. The same data was used. The most significant coefficients in all the CER's are weight and quantity. The cumulative RANDOM statistics and the $R^2$ values again point to the cumulative total CER as the most appropriate cumulative type CER. The marginal RANDOM statistics, $\hat{\sigma}_u^2$ and $\hat{\sigma}_\varepsilon^2$, are of near equal magnitude and therefore indicate the marginal RANDOM CER is called for in this case.

These results, as before, show that the marginal RANDOM technique is more appropriate for estimating material cost CER's.

Comparisons  Overall 

We know that the cumulative data approach is not correct here. There is auto-correlation present in the data and this affects the estimated CER's greatly. These CER's were presented, then, for illustrative purposes and to point out some of the properties exhibited by the RANDOM CER's. This study does not recommend the cumulative cost approach for any airframe cost estimation.

In  all  cases  the  cumulative  CER  statistics  seem  superior  to  the 
respective  marginal  statistics.  This  Is  the  result  of  the  auto-correla- 
tion problem  previously  mentioned. 

A 2 

The  cumulative  RANDOM  CER's  show  very  small  value  for  relative 
to  Oy2.  This  again  ran  be  a result  of  auto-correlated  d«ta.  The 
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marginal  components  are  quite  close  to  one  another  In  magnitude  In  nearly 
all  cases.  The  cumulative  RANDOM  CER  components  of  variance  Indicate 
one  other  Interesting  fact,  also,  o,2  Is  small  In  comparison  to  O 2 

Su2 

due  to  auto-correlatlon,  which  In  turn  causes  ne  value  p • — - — ■ ■ 

V * 56 

to  be  large.  Now,  If  p Is  very  close  to  one,  the  large 
V matrix  becomes  very  nearly  singular.  If  V Is  singular,  then  the  LAR 
or  cumulative  tote  1 cost  technique  Is  Indicated.  All  this  points  out 
the  lnapproprlateiiess  of  the  cumulative  data  approach  used  by  TT  and 
Indicates  a different  use  of  the  data  (marginal). 
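
The link between a large $\rho$ and a nearly singular V can be checked
numerically. The sketch below is illustrative only. It assumes that the
block of V belonging to one airframe type with m lots has
$\sigma_U^2 + \sigma_\epsilon^2$ on the diagonal and $\sigma_U^2$ off
the diagonal; the function names and the component values are
hypothetical and are not estimates from this study.

```python
import numpy as np

def lot_block(m, s2_u, s2_e):
    """Assumed covariance block for m lots of one airframe type."""
    return s2_u * np.ones((m, m)) + s2_e * np.eye(m)

def rho(s2_u, s2_e):
    """Share of the total error variance attributed to airframe type."""
    return s2_u / (s2_u + s2_e)

# Hypothetical components: hold s2_u fixed and shrink s2_e toward zero.
for s2_e in (1.0, 0.1, 0.001):
    block = lot_block(4, 1.0, s2_e)
    print(f"rho = {rho(1.0, s2_e):.4f}   cond = {np.linalg.cond(block):.1f}")
```

Under this assumed structure the condition number of a block equals
$1 + m\rho/(1-\rho)$, so it grows without bound as $\rho$ approaches one.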

The marginal CER's, the only legitimate data approach used here,
show very consistent quantities. The variance components associated
with the RANDOM CER's tend to be much closer to one another in magnitude
in every case except engineering hours. The nearly equal magnitudes for
the component values in the other three categories do not favor either
the LAR or TT technique. One also can see that in every category
(excluding engineering again) the $R^2$ is better for the RANDOM CER as
compared to both the marginal TT CER and the LAR CER. These two factors,
then, indicate the marginal RANDOM technique is most appropriate.

One final comparison left to be considered in this chapter is
between the cumulative total cost (CTC) approach and the LAR approach.
The CTC approach was designed to eliminate the necessity for estimating
observations as LAR did. As one can see, the results show that the
cumulative total cost approach CER's are superior.
V. Prediction Analysis

This chapter will examine the predictive qualities of all the CER's
developed. The cost to be predicted will be that of the F-14, lots one
and two only. The comparison of predicted versus actual costs will be
presented in Appendix D.

F-14  Explanatory  Variables 

The only variable that presents any problem here is the speed of
the F-14. The speed is classified, but this paper can utilize any speed
value published in any public record. The most recent published maximum
speed for the F-14 was given as Mach 2.24. The method used to estimate
this value in knots was by comparison to other similar aircraft
with unclassified values given.

     F-105 .......... Mach 2.10 .......... 1195 knots
     F-4   .......... Mach 2.27 .......... 1220 knots

Interpolation between these two aircraft speed values yields an
approximation or estimate of the maximum speed at best altitude for the
F-14 of 1215 knots. Obviously, this estimate could cause prediction
error but, as was noted for the CER's presented, the speed explanatory
variable was the least significant in all cases. This should minimize
the effect of a poor speed estimate and result in acceptable predictions.
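
The interpolation can be reproduced in a few lines. This is a minimal
sketch that assumes a straight-line relationship between Mach number and
knots over the interval spanned by the two reference aircraft; the
function name is hypothetical.

```python
def interpolate_speed(mach, low, high):
    """Linear interpolation of speed in knots between two (Mach, knots) pairs."""
    (mach_lo, knots_lo), (mach_hi, knots_hi) = low, high
    fraction = (mach - mach_lo) / (mach_hi - mach_lo)
    return knots_lo + fraction * (knots_hi - knots_lo)

# Values quoted in the text: the F-105 and F-4 bracket the published F-14 Mach number.
f14_knots = interpolate_speed(2.24, low=(2.10, 1195.0), high=(2.27, 1220.0))
print(f"{f14_knots:.1f} knots")
```

The result falls between 1215 and 1216 knots, in line with the 1215 knots
used for prediction.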

For each cost element there are three sets of predictions calculated.
Set one includes the predictions for the first lot of F-14's using the
RANDOM and Timson and Tihansky (TT) techniques for both the cumulative
and marginal cost approaches. Set two includes the cumulative and
marginal predictions for the second lot of F-14's and, for an additional
comparison, the RANDOM prediction will be in two steps: (1) the
unadjusted prediction and (2) the adjusted prediction (discussed later in
this chapter). The third set of predictions will be the cumulative
RANDOM and TT technique based on the prediction of cost at airframe
quantity equal to 100. These values will be compared with the Large,
et al. (LAR) based prediction. The reader is reminded that a LAR CER
was not developed using marginal cost data. Table IX will summarize the
explanatory values to be utilized for predictions.


                         Table IX

          Explanatory Variable Values Used for Prediction

     Explanatory Variable      Value

     S                          1215
     W                         26500
     Q                            12     Lot one cumulative
                                  38     Lot two cumulative
                                   5     Lot one marginal
                                  24*    Lot two marginal
                                 100     Large type predictions

*The value used for the material cost element predictions was 23 due to
a significantly different learning curve slope estimation.

Jfifc  as  fmisiisng 


One of the advantages to using the RANDOM technique is that previous
information can be used to better predict the cost of a second or third,
etc., lot of airframes. The adjustment factor is essentially some
function of the variance components multiplied by a function of the
residual error values determined from earlier (in this case, lot cost)
predictions and the observation of the actual values as they occur.

The development of the mathematical technique is a simple extension
of the generalized least squares procedure as presented by Johnston
(Ref 7:212-213). The model is


     $$ Y = XB + U $$

with $E(U) = 0$ and $E(UU') = V$. The problem here is to predict a
single value of the dependent variable $y_0$ given the vector of
explanatory variable values $X_0$. One can now write

     $$ y_0 = X_0'B + u_0 $$

where $u_0$ is the true but unknown value of the disturbance. Again, as
before, assuming

     $$ E(u_0) = 0 $$

     $$ E(u_0^2) = \sigma_0^2 = \sigma_R^2 $$

     $$ E(u_0 U) = \begin{bmatrix} E(u_1 u_0) \\ E(u_2 u_0) \\ \vdots \\ E(u_n u_0) \end{bmatrix} = W $$

where W is the n x 1 vector of covariances of the prediction disturbance
with the vector of sample disturbances.


Now Johnston defines a linear predictor as

     $$ \hat{p} = \hat{C}'Y $$

where $\hat{C}$ is a vector of n constants. If $\hat{p}$ is to be a best
linear unbiased predictor, then one must choose $\hat{C}$ to minimize
the predictor variance

     $$ \sigma_p^2 = E[(\hat{p} - y_0)^2] \tag{40} $$

subject to $E(\hat{p} - y_0) = 0$. With the use of Lagrange multipliers
Johnston derives $\hat{C}$ as

     $$ \hat{C} = V^{*-1}[I - X(X'V^{*-1}X)^{-1}X'V^{*-1}]W + V^{*-1}X(X'V^{*-1}X)^{-1}X_0 \tag{41} $$

which leads to

     $$ \hat{p} = \hat{C}'Y = X_0'\hat{B} + W'V^{*-1}E \tag{42} $$

where $E = Y - X\hat{B}$ is the vector of GLS residuals and
$X_0'\hat{B}$ is the unadjusted prediction.
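
The predictor can be sketched numerically as follows. This is not the
program used in this study: the data are randomly generated, the
function names are hypothetical, and V plays the role of $V^*$. The
sample is assumed to hold 8 lots from 3 airframe types, with the new lot
assumed to share a type with the last group.

```python
import numpy as np

def gls_coefficients(X, y, V):
    """GLS estimate (X'V^-1 X)^-1 X'V^-1 y."""
    Vi_X = np.linalg.solve(V, X)
    return np.linalg.solve(X.T @ Vi_X, Vi_X.T @ y)

def blup(x0, X, y, V, w):
    """Unadjusted prediction x0'B plus the correction w'V^-1 E."""
    b = gls_coefficients(X, y, V)
    residuals = y - X @ b
    return x0 @ b + w @ np.linalg.solve(V, residuals)

# Hypothetical sample (all numbers are made up for illustration).
rng = np.random.default_rng(0)
groups = np.array([0, 0, 0, 1, 1, 2, 2, 2])            # airframe type of each lot
X = np.column_stack([np.ones(8), rng.normal(size=(8, 2))])
y = X @ np.array([1.0, 0.5, -0.3]) + rng.normal(scale=0.1, size=8)
V = 0.04 * (groups[:, None] == groups[None, :]) + 0.02 * np.eye(8)
x0 = np.array([1.0, 0.2, -0.1])
w = 0.04 * (groups == 2)       # covariance of the new lot with each sample lot

print("unadjusted:", x0 @ gls_coefficients(X, y, V))
print("adjusted:  ", blup(x0, X, y, V, w))
```

When W is a vector of zeros the correction term vanishes and the
predictor reduces to the unadjusted prediction.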

In this paper the only prediction involving the above methods is
the RANDOM technique prediction of lot two. In our case, for lot two,

     $$ W = \begin{bmatrix} 0 \\ \vdots \\ 0 \\ \hat\sigma_U^2 \end{bmatrix} \tag{43} $$

     $$ V^* = \begin{bmatrix} V & 0 \\ 0' & \hat\sigma_U^2 + \hat\sigma_\epsilon^2 \end{bmatrix} \tag{44} $$

therefore,

     $\hat{p}_2$ = Adjusted Prediction Lot Two (RANDOM)

     $$ \hat{p}_2 = X_2'\hat{B} + \left( \frac{\hat\sigma_U^2}{\hat\sigma_U^2 + \hat\sigma_\epsilon^2} \right) e \tag{45} $$

where e represents the actual cost of the first lot minus the predicted
cost and $X_2'\hat{B}$ represents the unadjusted prediction for lot two. Of
course, the adjusted prediction of lots three and on would require more
complex matrix manipulation, but Eq (42) could be used with the proper
construction of W, E, and $V^*$.
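
The lot two adjustment amounts to a one-line calculation. The sketch
below assumes the prediction and the lot one error are expressed in the
units in which the CER is fitted (log units for a log-linear form); the
function name and all numbers are hypothetical.

```python
def adjusted_prediction(unadjusted, lot_one_error, s2_u, s2_e):
    """Add rho times the observed lot one error to the lot two prediction."""
    rho = s2_u / (s2_u + s2_e)
    return unadjusted + rho * lot_one_error

# Hypothetical values: lot one was under-predicted by 0.30.
print(adjusted_prediction(9.00, 0.30, s2_u=0.04, s2_e=0.04))  # equal components
print(adjusted_prediction(9.00, 0.30, s2_u=0.00, s2_e=0.04))  # no airframe-type component
```

With equal components half of the lot one error is carried forward; with
no airframe-type component the prediction is left unadjusted.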

Prediction  Intervals 

Along with the actual point estimates, this chapter will also
present prediction intervals. A user of a CER who wishes to predict a
future observation usually wants or needs to make a statement regarding
the confidence that can be placed in the prediction. There are two
common measures of this type: confidence intervals and prediction
intervals. Confidence intervals are estimated limits within which,
with some specified probability, the mean of the distribution of all
possible observations about the true regression line lies. Prediction
intervals are the estimated limits within which, with some probability,
the value of a single (future) observation lies. This study
will use only the prediction interval confidence measure since there
will be a comparison of the prediction capabilities of the three
different types of CER's (Ref 13:2).

The actual equations used to obtain these intervals are developed
in most statistical texts. Of course, the prediction interval formula
for the RANDOM prediction is different from the other two prediction
formulas and will be discussed later. The TT and LAR technique based
CER's must use the following equation for the interval calculations for
all predictions:

     $$ \pm\, t_{\alpha/2,\,n-k}\; \hat\sigma_R \sqrt{1 + X_0'(X'X)^{-1}X_0} \tag{46} $$

where $t_{\alpha/2,\,n-k}$ is the t statistic corresponding to the
desired confidence level (in our case 90% confidence), $\hat\sigma_R$ is
an estimate of the standard deviation of regression, $X_0$ is the vector
of explanatory variable values utilized to obtain the prediction, and X
is the observation set of explanatory variable values used for
regression. The above value is then added to and subtracted from the
prediction to form a prediction interval.

The equation for calculating the interval for RANDOM lot one
predictions is very similar to the previous one. The only change to
Eq (46) is that the value $V^{-1}$ is placed between the values $X'X$:

     $$ \pm\, t_{\alpha/2,\,n-k}\; \hat\sigma_R \sqrt{1 + X_0'(X'V^{-1}X)^{-1}X_0} \tag{47} $$

where V is the matrix constructed of the variance components (see page 14).
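
Both half-widths can be computed with one small routine. The sketch
below is an illustration under stated assumptions: the design matrix,
the grouping of lots, the value of $\hat\sigma_R$, and the t value are
all hypothetical placeholders, and the function name is not from this
study.

```python
import numpy as np

def interval_half_width(x0, X, sigma_r, t_value, V=None):
    """Half-width of Eq (46) when V is None, otherwise of Eq (47)."""
    n = X.shape[0]
    V_inv = np.eye(n) if V is None else np.linalg.inv(V)
    spread = x0 @ np.linalg.inv(X.T @ V_inv @ X) @ x0
    return t_value * sigma_r * np.sqrt(1.0 + spread)

# Hypothetical design matrix: intercept plus two explanatory variables.
X = np.array([[1.0, 0.2, 1.1], [1.0, 0.9, 0.3], [1.0, 1.5, 2.0],
              [1.0, 2.1, 0.7], [1.0, 2.8, 1.6], [1.0, 3.3, 0.1]])
x0 = np.array([1.0, 1.7, 1.0])
groups = np.array([0, 0, 1, 1, 2, 2])                  # airframe type of each lot
V = 0.5 * (groups[:, None] == groups[None, :]) + 0.5 * np.eye(6)

print(interval_half_width(x0, X, sigma_r=0.3, t_value=1.699))       # Eq (46)
print(interval_half_width(x0, X, sigma_r=0.3, t_value=1.699, V=V))  # Eq (47)
```

The half-width is added to and subtracted from the prediction in the
units of the fitted CER; passing an identity matrix for V reproduces the
Eq (46) value.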

Follow-on lot prediction intervals under the RANDOM technique are
quite different. The prediction variance changes when more information
becomes available. For example, in the case here, the prediction
variance, $\hat\sigma_p^2$, changes after the observance of the F-14 lot
one cost. Essentially, the $\hat\sigma_p^2$ must be re-estimated after
each observation. The equations used to accomplish this re-estimation
can be derived along the same lines as Johnston's development
(Ref 7:212-213). One must realize, though, that Johnston's equations for
an updated prediction variance utilize an updated $\hat{B}$ vector. For
example, in this study the first lot results (cost observed) are added
to the data set and then a new $\hat{B}$ vector could be estimated along
with a new $V^*$ matrix. Since this thesis did not update the $\hat{B}$
vector, the equations from Johnston must be modified.

Again, as in the adjustment of the F-14 lot two point prediction,
W and $V^*$ are defined as in Eqs (43) and (44). Now, using previous
notation and further defining

     $y_1$ = actual cost of lot 1

     $\hat{p}_1$ = prediction of lot 1 cost

     $\rho = \hat\sigma_U^2 / (\hat\sigma_U^2 + \hat\sigma_\epsilon^2)$

     $X_1$ = vector of explanatory variables associated with lot 1

the most obvious way to approach the development of the
$\hat\sigma_{p_2}^2$ equation when one does not re-estimate the $\hat{B}$
vector is to begin with the equations resulting from Johnston's
development concerning the adjustment of the lot two costs (Eq (45)).
The equation is

     $$ \hat{p}_2 = X_2'\hat{B} + \rho e \tag{48} $$


where

     $$ e = y_1 - \hat{p}_1 \tag{49} $$

and it was developed from Johnston's final result where given a vector
$\hat{C}$ the best linear unbiased predictor is defined as

     $$ \hat{p} = X_0'\hat{B} + W'V^{*-1}E \tag{50} $$


Using Eqs (48) and (50) and the definition of $\hat{B}$ and e one can
now obtain an equation for $\hat{C}$ that does not use an updated
$\hat{B}$ vector for the $\hat\sigma_{p_2}^2$ estimate associated with
our lot two adjusted prediction:

     $$ \hat{B} = (X'V^{-1}X)^{-1}X'V^{-1}Y \tag{13} $$

     e = error associated with the F-14 lot one prediction
         using the RANDOM CER

     $$ e = y_1 - X_1'\hat{B} \tag{51} $$

Substituting into Eq (48),

     $$ \hat{p}_2 = (X_2 - \rho X_1)'(X'V^{-1}X)^{-1}X'V^{-1}Y + \rho y_1 \tag{52} $$

Now, referring to Eqs (50) and (52), one can immediately determine a
new $\hat{C}$ vector for the case here with the original V and X
matrices. It can be defined as the following row vector:

     $$ \hat{C}' = [\,(X_2 - \rho X_1)'(X'V^{-1}X)^{-1}X'V^{-1} \;\vdots\; \rho\,] \tag{53} $$



Remembering from Eq (40), the prediction variance is defined as

     $$ \sigma_p^2 = E[(\hat{p} - y_0)^2] \tag{40} $$

subject to $E(\hat{p} - y_0) = 0$. Johnston minimizes $\sigma_p^2$ with
the use of his $\hat{C}$ vector which uses the updated $V^*$ and X
matrices and for the F-14 lot two case would be

     $$ \hat\sigma_{p_2}^2 = \hat{C}'V^*\hat{C} + \hat\sigma_R^2 - 2\hat{C}'W \tag{54} $$

where $\hat\sigma_R^2$ is the estimated variance due to regression.
Using the non-updated $\hat{C}$ vector developed here, one can now
develop an equation for $\hat\sigma_{p_2}^2$ which accounts for the fact
that in this thesis we did not re-estimate the coefficients, $\hat{B}$,
after observing the F-14 lot one acquisition cost.

     $$ \hat\sigma_{p_2}^2 = \hat{C}'V^*\hat{C} + \hat\sigma_R^2 - 2\hat{C}'W $$

     $$ = [\,(X_2 - \rho X_1)'(X'V^{-1}X)^{-1}X' \;\vdots\; \hat\sigma_U^2\,]\,\hat{C} + \hat\sigma_U^2 + \hat\sigma_\epsilon^2 - 2\rho\hat\sigma_U^2 $$

     $$ = (X_2 - \rho X_1)'(X'V^{-1}X)^{-1}(X_2 - \rho X_1) + \hat\sigma_U^2 + \hat\sigma_\epsilon^2 - \rho\hat\sigma_U^2 \tag{55} $$

Eq (55) is now in the proper form to be used in this thesis. The equation
for the prediction variance for lot one, $\hat\sigma_{p_1}^2$, is Eq (55)
with $\rho = 0$. Therefore, the lot two prediction variance will be
significantly smaller than the variance associated with the F-14 lot one
prediction. The interval should decrease dramatically. This, of course,
gives the RANDOM technique a great advantage over the others when
predicting follow-on lots of airframes.
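
The effect of $\rho$ on the prediction variance can be illustrated with
a short calculation of Eq (55). The sample below is randomly generated
and the variance components are hypothetical; the matrix M stands for
$(X'V^{-1}X)^{-1}$ computed from the original sample, and the function
name is an assumption made for this sketch.

```python
import numpy as np

def lot_two_variance(x1, x2, M, s2_u, s2_e, rho):
    """Eq (55); M stands for (X'V^-1 X)^-1 from the original sample."""
    d = x2 - rho * x1
    return d @ M @ d + s2_u + s2_e - rho * s2_u

# Hypothetical sample: 8 lots from 3 airframe types, 3 coefficients.
rng = np.random.default_rng(1)
groups = np.array([0, 0, 0, 1, 1, 2, 2, 2])
X = np.column_stack([np.ones(8), rng.normal(size=(8, 2))])
s2_u, s2_e = 0.04, 0.04
V = s2_u * (groups[:, None] == groups[None, :]) + s2_e * np.eye(8)
M = np.linalg.inv(X.T @ np.linalg.solve(V, X))

x1 = np.array([1.0, 0.3, 0.5])   # lot one explanatory values
x2 = np.array([1.0, 0.3, 0.9])   # lot two, assumed to differ only in quantity
rho = s2_u / (s2_u + s2_e)
print("rho = 0  :", lot_two_variance(x1, x2, M, s2_u, s2_e, 0.0))
print("rho = 0.5:", lot_two_variance(x1, x2, M, s2_u, s2_e, rho))
```

Setting rho to zero returns the form of the variance used before any lot
of the new airframe has been observed.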

All the intervals would be symmetric for the linear or exponential
CER forms but in the case of the log-linear form used in this paper the
interval is skewed so that the interval is larger on the high end than
the low end. As was mentioned in an earlier chapter, this is more like
the real world where there seem to be more cost overruns than underruns.
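
The skew follows directly from taking antilogs of a symmetric interval.
The sketch below assumes the interval is symmetric in natural log units
and recovers the half-width from the RANDOM lot 1 cumulative engineering
row of Table X (prediction 8960, upper limit 12860); the function name
is hypothetical.

```python
import math

def percent_limits(log_half_width):
    """Percent limits implied by a symmetric half-width in log units."""
    lower = (math.exp(-log_half_width) - 1.0) * 100.0
    upper = (math.exp(log_half_width) - 1.0) * 100.0
    return lower, upper

# Half-width implied by the ratio of the upper limit to the prediction.
h = math.log(12860 / 8960)
lower, upper = percent_limits(h)
print(f"{lower:.1f}%  {upper:+.1f}%")
```

The same half-width in log units gives roughly 30 percent on the low side
and 44 percent on the high side, which agrees with the tabulated row.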

Prediction  Presentation 

Tables X, XI, XII, and XIII present the prediction results for
engineering, tooling, labor and material cost elements, respectively.
The costs presented are in thousands of hours/dollars and a percentage
value is also given to facilitate interval width comparisons. This
percent value is calculated by subtracting the point estimate
(prediction) from the lower or upper interval value and then dividing
this value by the point estimate. The value of this resulting
calculation is then multiplied by 100 to obtain the percent interval.
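
As a check on the percent calculation, the short sketch below reproduces
the percentages for the TT lot 1 cumulative row of Table X. The sign
convention (limit minus prediction) is chosen to match the signs shown
in the tables; the function name is hypothetical.

```python
def percent_interval(limit, prediction):
    """Signed distance of an interval limit from the prediction, in percent."""
    return (limit - prediction) / prediction * 100.0

# TT lot 1, cumulative engineering hours (Table X).
lower, prediction, upper = 6434, 9147, 13002
print(round(percent_interval(lower, prediction), 1),
      round(percent_interval(upper, prediction), 1))
```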
                               Table X

                     Engineering Hour Predictions

CER Type/Lot #               Lower   Percent  Prediction   Percent    Upper
                             Limit                                    Limit

Cumulative Cost Data

RANDOM Lot 1                  6242     -30.3        8960      43.5    12860
TT Lot 1                      6434     -29.7        9147      42.1    13002
RANDOM Lot 2 (Unadjusted)     8373     -29.8       11919      42.4    16968
RANDOM Lot 2 (Adjusted)      13526     -16.6       16210      19.8    19427
TT Lot 2                      8856     -27.4       12104      37.7    16790
RANDOM 100 Airframes         10679     -29.5       15147      41.8    21483
TT 100 Airframes             11423     -26.4       15523      35.9    21095
LAR 100 Airframes             8373     -45.2       15280      82.5    27885

Marginal Cost Data

RANDOM Lot 1                  2708     -69.4        8837     226.3    28838
TT Lot 1                      2779     -69.0        8978     223.0    29002
RANDOM Lot 2 (Unadjusted)     1377     -66.9        4160     202.1    12568
RANDOM Lot 2 (Adjusted)       1440     -66.4        4282     197.4    12737
TT Lot 2                      1401     -66.5        4176     197.9    12442


GOR/SM/76D-10


                               Table XI

                       Tooling Hour Predictions

CER Type/Lot #               Lower   Percent  Prediction   Percent    Upper
                             Limit                                    Limit

Cumulative Cost Data

RANDOM Lot 1                  1973     -60.9        5050     155.9    12921
TT Lot 1                      2556     -37.9        4117      61.1     6631
RANDOM Lot 2 (Unadjusted)     2853     -60.6        7248     154.0    18412
RANDOM Lot 2 (Adjusted)       6394     -22.0        8197      28.2    10509
TT Lot 2                      4363     -35.2        6731      54.3    10383
RANDOM 100 Airframes          3879     -60.5        9816     153.1    24842
TT 100 Airframes              6711     -34.0       10169      51.5    15410
LAR 100 Airframes             6780     -39.5       11200      65.2    18501

Marginal Cost Data

RANDOM Lot 1                  1594     -73.5        6020     277.6    22731
TT Lot 1                      1725     -64.4        4852     181.2    13645
RANDOM Lot 2 (Unadjusted)      886     -72.2        3192     260.3    11502
RANDOM Lot 2 (Adjusted)       1152     -63.0        3109     169.8     8388
TT Lot 2                      1085     -61.8        2838     161.6     7425
                               Table XII

                        Labor Hour Predictions

CER Type/Lot #               Lower   Percent  Prediction   Percent    Upper
                             Limit                                    Limit

Cumulative Cost Data

RANDOM Lot 1                  3928     -44.0        7013      78.5    12321
TT Lot 1                      4077     -39.2        6705      64.4    11026
RANDOM Lot 2 (Unadjusted)     7979     -43.7       14173      77.6    25176
RANDOM Lot 2 (Adjusted)       9338     -15.3       11030      18.1    13028
TT Lot 2                      8875     -36.4       13951      57.2    21930
RANDOM 100 Airframes         14439     -43.6       25583      77.2    45325
TT 100 Airframes             16725     -35.2       25806      54.3    39819
LAR 100 Airframes            12537     -48.9       24543      95.8    48048

Marginal Cost Data

RANDOM Lot 1                  4143     -44.0        7396      78.5    13203
TT Lot 1                      4258     -39.6        7048      65.5    11666
RANDOM Lot 2 (Unadjusted)     4957     -42.5        8614      73.8    14967
RANDOM Lot 2 (Adjusted)       4693     -36.5        7396      57.6    11654
TT Lot 2                      5273     -37.1        8389      59.1    13346
                              Table XIII

                      Material Dollar Predictions

CER Type/Lot #               Lower   Percent  Prediction   Percent    Upper
                             Limit                                    Limit

Cumulative Cost Data

RANDOM Lot 1                 14935     -51.3       30674     105.4    63001
TT Lot 1                     19972     -46.8       37526      87.9    70511
RANDOM Lot 2 (Unadjusted)    36740     -50.9       74818     103.6   152361
RANDOM Lot 2 (Adjusted)     125174     -22.0      160531      28.2   205878
TT Lot 2                     50198     -43.6       89076      77.4   158065
RANDOM 100 Airframes         77982     -50.7      158145     102.8   320717
TT 100 Airframes            106185     -42.3      184034      73.3   318959
LAR 100 Airframes            69801     -60.5      176830     153.3   447968

Marginal Cost Data

RANDOM Lot 1                 18204     -45.3       33294      82.9    60893
TT Lot 1                     20943     -41.0       35482      69.4    60117
RANDOM Lot 2 (Unadjusted)    28219     -44.2       50597      79.3    90723
RANDOM Lot 2 (Adjusted)      56668     -32.6       84059      48.3   124690
TT Lot 2                     32592     -38.3       52828      62.1    85628



The marginal results presented are the CER predictions multiplied
by the appropriate number of airframes included in the lot. Therefore,
even though marginal refers to average cost per airframe per lot, the
tables will show a total lot cost for better comparison qualities.

A 90 percent prediction interval was calculated for the predictions
utilizing the Student-t value of 1.699 for the non-normalized
airframe quantity CER's (degrees of freedom = 29), and 1.943 for the
LAR CER's (degrees of freedom = 6). Cumulative total CER predictions
and prediction intervals were not calculated here because of a lack of
complete F-14 cost data.

The average percent intervals under the marginal, cumulative and
LAR cost approaches over all cost categories are listed in Table XIV.
The intervals are large in all cases. The TT prediction intervals are
the smallest when predicting lot one costs and the cost of 100 airframes.
But the RANDOM type prediction interval decreases greatly when
predicting the lot two cost. Of course, this occurs because the RANDOM
interval equations take into account the fact that the lot one cost is
known. This decreased interval is comparable to or better than the TT
interval for lot two in both the marginal and cumulative cost approaches.
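
The averages are simple means over the four cost elements. As an
illustration, the sketch below recomputes the RANDOM lot 1 cumulative
entry from the percent limits given in Tables X through XIII; the
variable and function names are hypothetical.

```python
# Percent limits (lower, upper) for RANDOM lot 1, cumulative cost data,
# copied from Tables X through XIII.
random_lot1_cumulative = {
    "engineering": (-30.3, 43.5),
    "tooling": (-60.9, 155.9),
    "labor": (-44.0, 78.5),
    "material": (-51.3, 105.4),
}

def average_limits(rows):
    """Mean lower and upper percent limits over the cost elements."""
    lowers, uppers = zip(*rows.values())
    return sum(lowers) / len(lowers), sum(uppers) / len(uppers)

avg_lower, avg_upper = average_limits(random_lot1_cumulative)
print(round(avg_lower, 1), round(avg_upper, 1))
```

The result agrees with the averages of -46.6 and 95.8 percent reported
for this case.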

                           Table XIV

                   Average Percent Intervals

CER Type/Lot #                 Lower     Upper
                               Limit     Limit

Cumulative Cost Data

RANDOM Lot 1                   -46.6      95.8
TT Lot 1                       -38.4      63.9
RANDOM Lot 2 (Adjusted)        -19.0      23.6
TT Lot 2                       -35.7      56.7

Quantity = 100

RANDOM                         -46.1      93.7
TT                             -34.5      53.8
LAR                            -48.5      99.2

Marginal Cost Data

RANDOM Lot 1                   -58.1     166.3
TT Lot 1                       -53.5     134.8
RANDOM Lot 2 (Adjusted)        -49.6     118.3
TT Lot 2                       -50.9     120.2

One may argue here that the TT intervals could have been reduced by
re-estimating the $\hat{B}$ coefficients after the lot one cost was
known. But remember that this was not done in the RANDOM case, as
explained earlier. There is no reason to believe that, had the
coefficients been re-estimated under both techniques, the comparison
would not have shown the same results. This is, of course, because the
RANDOM technique assumes there are two components of variance and when
the lot one cost is observed one is actually assuming that $\sigma_U^2$
is being observed, thereby reducing the total variance associated with
the lot two prediction.

This is what the RANDOM technique with the adjustments made for the lot
two actually does and the results here show a better average 90%
prediction interval for lot two, especially under the cumulative cost
approach. One can see that there is no decisively superior technique
in the marginal lot two intervals presented. This may be a consequence
of the possible auto-correlation problem discussed earlier in the
cumulative CER's. The marginal data approach eliminates this problem if
it is present and may be showing more realistic CER results. This also
accounts for the fact that in all cases here except the material cost
category the cumulative intervals are narrower than the marginal
intervals.

These results are consistent with the hypotheses presented in
Chapter I. That is, the TT technique underestimates the variance for
predicting the cost of a new type airframe (the TT lot one intervals are
smaller than RANDOM's) and overestimates the variance for predicting the
cost of a follow-on lot of airframes (the TT lot two intervals are
larger than RANDOM's). Also, the LAR intervals are the largest of the
three Q = 100 airframe predictions as was hypothesized in Chapter I.

Before the reader draws any conclusions on the basis of the previous
chapter and this one, he is advised to refer to Appendix D where the
point predictions themselves are compared to actual cost (this is
privileged information).

The reader is again reminded that the cumulative results are
suspect due to the auto-correlation that is very likely present.
VI. Summary and Conclusions

The RANDOM modeling technique is adaptable to airframe cost
estimation with the use of the unbiased, maximum likelihood variance
component estimators put forth by Searle and developed in Appendix A.

The RANDOM CER coefficients along with those of the redeveloped
Large, et al. (LAR) and Timson and Tihansky (TT) type CER's (all presented
in Chapter IV) are comparable to the coefficients estimated in past
studies using the same explanatory variables.

The cumulative data approach looks much better than the marginal
approach in almost all comparisons made in Chapters IV and V and Appendix
D. But this approach is inappropriate. The cumulative approach has
the inherent problem of auto-correlated data. This factor is known to
inflate $R^2$ values and affects the estimated variance component values
for the cumulative RANDOM CER's, i.e., the component associated with
different airframes is much larger than the other. These problems and
the correlation assumed to be present in lot data of like airframes all
point to the LAR type CER as being the most applicable when compared to
the TT and RANDOM techniques under the cumulative data approach.

However, the LAR type CER's are not appropriate either, as one
notices in the comparison of them with the cumulative total cost CER's.
This other technique is very much like LAR's except that the total
quantity ever produced of a particular airframe type is included as a
variable to predict total cumulative cost. This indicates that even
though the LAR technique eliminates both lot correlation and
auto-correlation problems, there still exists the "observation"
estimation factor. This is why the cumulative total CER $R^2$'s are
superior to the LAR $R^2$'s in all cases. The addition of the airframe
quantity variable eliminated the need for learning curve estimation and
resulted in a better CER.



Thus, if the cumulative data approach must be utilized, the most
appropriate technique is the cumulative total cost. However, as discussed
in Chapters III and IV, the LAR and cumulative total type techniques are
throwing valuable information away that is contained in the separate lot
data. This, of course, brings us to the marginal type CER's. The
RANDOM estimated variance components presented in Chapter IV do not
point to either the TT or LAR technique as being more appropriate. A
more general technique like RANDOM is indicated. The $R^2$ values add
more weight to the above statement; the RANDOM $R^2$'s are best in all
but the engineering marginal type CER's.

The F-14 prediction results presented in Chapter V and Appendix D
lead to similar findings from the comparison of the marginal and
cumulative CER's. That is, the cumulative prediction intervals look much
tighter than the marginal ones when in fact they are incorrect. Also,
the cumulative point predictions look better than the marginal
predictions. But, again, auto-correlation is the problem here. It causes
the cumulative prediction intervals to look smaller than they really are.

The point prediction comparisons seem to favor the cumulative
approach also. This is especially true of the lot two predictions.
However, one must remember that the cumulative lot two predictions must
be adjusted for proper comparisons with the same marginal predictions.
The adjustment does enlarge the cumulative lot two prediction error in
most cases.
These prediction results do indicate that our hypotheses are
generally correct. The marginal lot one intervals for the RANDOM
predictions are larger than the respective TT intervals. We did
hypothesize that the TT technique underestimates the prediction variance
for a new type airframe. However, the hypothesis that the TT technique
overestimates the prediction variance of follow-on lots is not very
evident in the marginal lot two interval results. The RANDOM and TT
intervals are very nearly equal. Also, the LAR prediction intervals are
larger than necessary.

The point prediction results are not explained by our hypotheses.
The RANDOM predictions should be superior, especially for lot two costs.
This did not occur in this study. This could be the result of an
inherent problem in the RANDOM lot two adjustment process. That is, when
the RANDOM prediction errors are of opposite signs for lots one and two
(unadjusted), the adjustment process actually increases the lot two
prediction error. This did occur in two of the four cost categories in
this study. This caused the RANDOM prediction to appear inferior. But
this particular problem should disappear when predicting lots three,
four, etc. The RANDOM predictions and associated intervals should
become much better than the same TT type results as the number of
observed lots increases. One must also realize that this is a very small
test of the predictive capabilities. The F-14 data may be inaccurate or
atypical. A great deal more analysis in this area must be accomplished.
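
The sign problem can be seen in a small numerical illustration. The
sketch below defines an error as actual cost minus predicted cost, in
the units of the fitted CER, and applies the lot two adjustment of
Eq (45); the value of rho and the error values are hypothetical.

```python
def adjusted_error(lot_one_error, lot_two_error, rho):
    """Lot two error (actual minus prediction) after the lot two adjustment."""
    return lot_two_error - rho * lot_one_error

rho = 0.5  # hypothetical; corresponds to equal variance components
same_sign = adjusted_error(0.4, 0.3, rho)        # both lots under-predicted
opposite_sign = adjusted_error(0.4, -0.3, rho)   # lot one under-, lot two over-predicted
print(abs(same_sign), abs(opposite_sign))
```

When the two unadjusted errors share a sign the adjustment moves the lot
two prediction toward the actual cost; when they differ in sign it moves
the prediction further away.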

This study laid the foundation for a very promising "new" technique
for estimating airframe acquisition cost models. This technique is much
more general than present methods and this characteristic can be utilized
to determine which of the two most used airframe cost estimating
techniques, Large's or Timson and Tihansky's, is more appropriate. If
the LAR or TT techniques cannot be justified by an analysis of the
RANDOM variance components estimated, then the RANDOM technique itself
may be appropriate. Unfortunately, the prediction results were
inconsistent as compared to the CER statistics. This is not surprising
when one uses only one airframe type and two lots as a test case.

Even with some contradictory results the RANDOM technique is very
promising. This is especially evident in the CER comparisons contained
in Chapters IV and V where the marginal RANDOM CER consistently proved
to be the most appropriate. There are obvious shortcomings in the TT
and LAR techniques. The RANDOM marginal technique is able to account
for these shortcomings and therefore is an important additional tool to
be used in airframe cost estimation. One could not conclude from this
small study that the RANDOM technique is the most appropriate technique
for all airframe cost estimation. But it does indicate that the LAR
and TT techniques are definitely not appropriate in all cases. This new
technique deserves the interest of DoD.

New  Research 

What new directions of research are indicated as a result of this
study? The most obvious "next step" would be to conduct a very similar
study using all the airframe cost data available and then determine the
best explanatory variables to use for each CER estimated with some sort
of step-wise procedure. The same format could be used as was in this
thesis, except this new study should do two things differently. First,
the coefficients should be re-estimated after observing a previous
lot cost. This should result in better prediction capability analysis
for all techniques utilized. Second, the engineering cost element data
problem should be investigated. There may be a way to account for the
heavy non-recurring cost loading in the first lots. The marginal RANDOM
technique proved to be the most appropriate in many comparisons
presented in this thesis, but this new study must be accomplished to
validate these findings.

As a separate thesis effort or as part of the above-mentioned study,
the significance level of the estimated variance components should be
addressed. The distributional properties of the components have to be
discovered; if that is done, one could test the hypothesis that
either component is zero. If $\sigma_U^2 = 0$, then use the TT
approach; if $\sigma_\epsilon^2 = 0$, use the LAR approach; and if
neither is equal to zero, then the RANDOM technique may be the method
prescribed to estimate airframe costs.

Appendix A

The  Development  of  the  Variance 
Component  Estimation  Procedure 

This  section  will  first  discuss  the  general  application  of  mixed 
models  and  the  variance  components  associated  with  them.  Then  a detailed 
development  of  the  variance  estimation  process  utilized  in  this  study 
will  be  presented. 

Mixed Fixed Effects/Random Effects Model

Eisenhart introduced the term "mixed model" to describe the models
used for experiments where some of the effects, such as animal effects,
can be random effects while other effects, such as different treatments,
are regarded as fixed (Ref 3:1-21). The method of estimation of the
variance components is fully understood for various mixed data
arrangements when there is a high degree of symmetry. However, for a
general non-orthogonal design there are difficulties, and no simple,
known method is optimal under all conditions (Ref 15:767).

The lack of balance in a data set is a problem encountered often in
the real world. This lack of balance causes the data to be non-orthogonal.
In this study the data is in fact unbalanced, i.e., there are different
numbers of lots associated with each airframe type. This
non-orthogonality of the lot data necessitates a very specialized method
to estimate the variance components.

Cunningham and Henderson proposed a general method for estimating
the variance components for an unbalanced situation with iterative
calculations (Ref 2:13-25). But Thompson discovered an algebraic
oversight on their part which caused the iterative process to converge
to unreasonable estimates (Ref 15:767). Thompson corrected this error,
and Searle further refined the technique to fit the situation when one
has a mixed model having one random factor (Ref 12:465-470).

The above procedure given by Searle is tailor-made for the RANDOM
technique described in this study. Of course, the one random factor
here is the airframe type, while the fixed portion of the model is
associated with the aircraft characteristics.

Development  of  the  Estimation  Procedure 

This section will parallel the development of the estimation
technique as presented by Searle (Ref 12:465-470).

To begin with, the model must be defined as

$$Y = XB + ZU + \epsilon \tag{56}$$

with the rank of X equal to r (that is, r(X) = r), with B representing
$q \ge r$ fixed effects, and with U, representing the random effects,
containing t effects for one random factor (airframe) having a variance
equal to $\sigma_U^2$. Z then has full column rank, t, with its columns
summing to a vector of ones. Because that vector also lies in the column
space of X whenever the fixed part of the model includes an intercept,

$$r(X \; Z) = r(X) + t - 1 = r + t - 1 \tag{57}$$

where the space between X and Z indicates a partitioned matrix. Also,
as one can see, the matrix Z'Z is diagonal and non-singular.
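
The structure of Z and the rank relation of Eq (57) can be illustrated with a small numerical sketch. Everything below is hypothetical: the lot counts, the helper name `indicator_matrix`, the random explanatory variables, and the use of Python with NumPy are assumptions made only for illustration and are not part of the study.

```python
import numpy as np

def indicator_matrix(labels):
    """Return the N x t matrix Z with one 0/1 column per distinct label."""
    levels = sorted(set(labels))
    return np.array([[1.0 if lab == lev else 0.0 for lev in levels]
                     for lab in labels])

# Hypothetical unbalanced layout: three airframe types with 4, 2 and 3 lots.
types = ["A"] * 4 + ["B"] * 2 + ["C"] * 3
Z = indicator_matrix(types)
N, t = Z.shape

# X holds an intercept column plus two made-up explanatory variables.
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(N), rng.normal(size=(N, 2))])

ZtZ = Z.T @ Z                                  # diagonal: lots per airframe type
rank_X = np.linalg.matrix_rank(X)              # r(X)
rank_XZ = np.linalg.matrix_rank(np.hstack([X, Z]))   # r(X Z)

print(np.diag(ZtZ))
print(rank_X, t, rank_XZ)
```

Each row of Z contains a single one, so the columns of Z add up to the intercept column of X. That shared direction is what removes one from the rank in Eq (57).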

Since the model is mixed, the normal estimation procedure here
would be to use the fitting constants method (Henderson's Method 3)
(Ref 5:226-252), where the sum of squares of error (SSE) and the sums
of squares due to regression, R(·), are defined as follows.

$$SSE = Y'Y - R(B,U) \tag{58}$$

$$R(U|B) = R(B,U) - R(B) \tag{59}$$

$$E(SSE) = [N - (r + t - 1)]\,\sigma_\epsilon^2 \tag{60}$$

Additionally, from the fitting constants method

$$E[R(U|B)] = \sigma_U^2\,\mathrm{tr}[Z'Z - Z'X(X'X)^{-}X'Z] + \sigma_\epsilon^2\,[r(X) + t - 1 - r(X)] \tag{61}$$

Thus, point estimates of the variance components are

$$\hat{\sigma}_\epsilon^2 = \frac{Y'Y - R(B,U)}{N - r(X) - t + 1} \tag{62}$$

$$\hat{\sigma}_U^2 = \frac{R(U|B) - \hat{\sigma}_\epsilon^2\,(t - 1)}{\mathrm{tr}[Z'Z - Z'X(X'X)^{-}X'Z]} \tag{63}$$


Now, a computational difficulty exists in the preceding formulation.
That is,

$$R(B,U) = Y'[X \; Z] \begin{bmatrix} X'X & X'Z \\ Z'X & Z'Z \end{bmatrix}^{-} \begin{bmatrix} X' \\ Z' \end{bmatrix} Y \tag{64}$$

This calculation can be very involved due to the large size of Z. The
"absorption process" described by Searle permits easier calculation as
follows (Ref 12:266-269): With

$$R(U) = Y'Z(Z'Z)^{-1}Z'Y \tag{65}$$

one finds that

$$R(B|U) = R(B,U) - R(U) \tag{66}$$

simplifies, with the substitution from Eqs (64) and (65), to

$$R(B|U) = B^{\circ\prime}X'[I - Z(Z'Z)^{-1}Z']Y \tag{67}$$

where

$$B^{\circ} = Q^{-}X'[I - Z(Z'Z)^{-1}Z']Y \tag{68}$$

and

$$Q = X'X - X'Z(Z'Z)^{-1}Z'X \tag{69}$$

From the above simplification one can now use Eqs (65) and (67) to
calculate

$$R(B,U) = R(B|U) + R(U) \tag{70}$$

$$R(U|B) = R(B|U) + R(U) - R(B) \tag{71}$$

where

$$R(B) = Y'X(X'X)^{-}X'Y \tag{72}$$

The crucial result derived from the above equations is that $B^{\circ}$
of Eq (68) is a solution to

$$\begin{bmatrix} X'X & X'Z \\ Z'X & Z'Z \end{bmatrix} \begin{bmatrix} B^{\circ} \\ U^{\circ} \end{bmatrix} = \begin{bmatrix} X'Y \\ Z'Y \end{bmatrix} \tag{73}$$

which are actually the ordinary least squares equations for $B^{\circ}$ and
$U^{\circ}$, assuming that U is a vector of fixed rather than random effects.
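
The absorption result can be checked numerically. The sketch below uses made-up data and NumPy (both are assumptions for illustration; the Moore-Penrose inverse from `np.linalg.pinv` stands in for the generalized inverses). It computes R(B,U) directly from the partitioned system and again from the absorbed form, where only the diagonal matrix Z'Z is inverted, and then confirms that the absorbed solution satisfies the ordinary least squares equations.

```python
import numpy as np

rng = np.random.default_rng(1)
types = np.repeat([0, 1, 2, 3], [5, 2, 4, 3])    # hypothetical lot counts
N, t = types.size, 4
Z = np.eye(t)[types]                             # indicator matrix for airframe type
X = np.column_stack([np.ones(N), rng.normal(size=(N, 2))])
Y = rng.normal(size=N)                           # made-up response

# Direct route: reduction in sum of squares from fitting X and Z together.
W = np.hstack([X, Z])
R_BU_direct = Y @ W @ np.linalg.pinv(W.T @ W) @ W.T @ Y

# Absorbed route: Z'Z is diagonal, so its inverse is trivial.
ZtZ_inv = np.diag(1.0 / np.diag(Z.T @ Z))
M = X.T @ (np.eye(N) - Z @ ZtZ_inv @ Z.T)        # X'[I - Z(Z'Z)^-1 Z']
Q = M @ X                                        # absorbed coefficient matrix
B0 = np.linalg.pinv(Q) @ M @ Y                   # solution for the fixed effects
R_B_given_U = B0 @ M @ Y
R_U = Y @ Z @ ZtZ_inv @ Z.T @ Y
R_BU_absorbed = R_B_given_U + R_U

# Back-solve for U0 and check the full least squares equations.
U0 = ZtZ_inv @ Z.T @ (Y - X @ B0)
lhs = W.T @ W @ np.concatenate([B0, U0])
rhs = W.T @ Y

print(R_BU_direct, R_BU_absorbed)
```

The absorbed route works with a matrix whose order is the number of fixed effects rather than the number of fixed effects plus airframe types, which is the saving the absorption process is meant to provide.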

Developing the comparable equations for obtaining the maximum
likelihood solutions for the fixed effects requires more detail than
is warranted here. Searle develops the theory, and the results are
that (Ref 12:461-462)

$$\begin{bmatrix} X'X & X'Z \\ Z'X & Z'Z + \lambda I \end{bmatrix} \begin{bmatrix} B^{*} \\ U^{*} \end{bmatrix} = \begin{bmatrix} X'Y \\ Z'Y \end{bmatrix} \tag{74}$$

where

$$\lambda = \sigma_\epsilon^2 / \sigma_U^2 \tag{75}$$

This is where Thompson picked up an error in the presentation by
Cunningham and Henderson. They falsely deduced that since Eq (73) is
the same as Eq (74) except for the quantity $(Z'Z + \lambda I)$ replacing Z'Z,
one could make this replacement throughout the entire variance component
estimation process described by Eqs (63) through (72). The result is
an iterative procedure based on the maximum likelihood equations implicit
in Eq (74). Therefore, the variance component estimating equations
become

$$\hat{\sigma}_\epsilon^2 = \frac{Y'Y - R^{*}(B,U)}{N - r(X) - t + 1} \tag{76}$$

$$\hat{\sigma}_U^2 = \frac{R^{*}(U|B) - \hat{\sigma}_\epsilon^2\,(t - 1)}{\mathrm{tr}[Z'Z + \lambda I - Z'X(X'X)^{-}X'Z]} \tag{77}$$

The comparable R*(·) terms are

$$R^{*}(B,U) = R^{*}(B|U) + R^{*}(U) \tag{78}$$

$$R^{*}(U) = Y'ZP^{-1}Z'Y \tag{79}$$

$$R^{*}(B) = R(B) = Y'X(X'X)^{-}X'Y \tag{80}$$

and

$$R^{*}(U|B) = R^{*}(B|U) + R^{*}(U) - R^{*}(B) \tag{81}$$

where

$$P = Z'Z + \lambda I \tag{82}$$

Thompson, in his article referenced earlier, pointed out that the
error here is the false assumption that the expected values of SSE* and
R*(U|B) are the same as those of SSE and R(U|B) shown in Eqs (60) and
(61). This is not so, and consequently the estimators given by Eqs (76)
and (77) are not unbiased.

Now, to derive maximum likelihood unbiased estimates of the components,
Thompson modified the equations by noticing first that, from Eq (82),

$$\begin{aligned}
P^{-1}Z'(ZZ'\sigma_U^2 + \sigma_\epsilon^2 I) &= P^{-1}(Z'Z\,\sigma_U^2 + \sigma_\epsilon^2 I)Z' \\
&= P^{-1}(Z'Z + \lambda I)\,\sigma_U^2 Z' \\
&= P^{-1}P\,Z'\sigma_U^2 \\
&= \sigma_U^2 Z'
\end{aligned} \tag{83}$$

second, from the model (Eq (56)),

$$E(YY') = XBB'X' + ZZ'\sigma_U^2 + \sigma_\epsilon^2 I \tag{84}$$

and finally, using the properties of the trace operator, the expected
value of R*(U) is

$$E[R^{*}(U)] = \mathrm{tr}[ZP^{-1}Z'XBB'X' + ZZ'\sigma_U^2] \tag{85}$$

Next, defining the variable T

$$T = I - ZP^{-1}Z' \tag{86}$$


Eq (83) results in

$$T(ZZ'\sigma_U^2 + \sigma_\epsilon^2 I) = \sigma_\epsilon^2 I \tag{87}$$

so that now

$$E[R^{*}(B|U)] = \mathrm{tr}[TXBB'X' + TX(X'TX)^{-}X'\sigma_\epsilon^2] \tag{88}$$

and

$$E[R^{*}(B)] = \mathrm{tr}[XBB'X' + X(X'X)^{-}X'(ZZ'\sigma_U^2 + \sigma_\epsilon^2 I)] \tag{89}$$

Now, from Eq (81) and using Eqs (85), (88), and (89),

$$E[R^{*}(U|B)] = \sigma_U^2\,\mathrm{tr}[Z'Z - Z'X(X'X)^{-}X'Z] + \sigma_\epsilon^2\,\mathrm{tr}[X'TX(X'TX)^{-} - X'X(X'X)^{-}] \tag{90}$$

But the rank of X'TX equals the rank of X, and

$$\mathrm{tr}[X'X(X'X)^{-}] = r(X) \tag{91}$$

so both traces in the last term of Eq (90) equal r(X), making that term
equal to zero, so that

$$E[R^{*}(U|B)] = \sigma_U^2\,\mathrm{tr}[Z'Z - Z'X(X'X)^{-}X'Z] \tag{92}$$

Also, from the above equations one can derive

$$E[Y'Y - R^{*}(B,U)] = E[Y'Y - R^{*}(U) - R^{*}(B|U)] = [N - r(X)]\,\sigma_\epsilon^2 \tag{93}$$

where N is the total number of observations.

This all results, then, in an iterative procedure utilizing

$$\lambda = \hat{\sigma}_\epsilon^2 / \hat{\sigma}_U^2$$

to obtain the unbiased, maximum likelihood estimators of
$\sigma_\epsilon^2$ and $\sigma_U^2$ using the equations

$$\hat{\sigma}_\epsilon^2 = \frac{Y'Y - [R^{*}(U) + R^{*}(B|U)]}{N - r(X)} = \frac{Y'Y - R^{*}(B,U)}{N - r} \tag{94}$$

and

$$\hat{\sigma}_U^2 = \frac{R^{*}(U) + R^{*}(B|U) - R^{*}(B)}{\mathrm{tr}[Z'Z - Z'X(X'X)^{-}X'Z]} = \frac{R^{*}(U|B)}{\mathrm{tr}[Z'Z - Z'X(X'X)^{-}X'Z]} \tag{95}$$

The variance component estimation process, therefore, is accomplished
by taking an initial value of $\lambda$, calculating P and T, which leads to
the calculation of the necessary terms R*(U), R*(B|U), etc., and finally
the variance components themselves. Then $\lambda$ is again calculated with
the new variance estimators and the procedure is repeated until convergence.
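
The two matrix identities on which the unbiasedness argument rests, Eq (83) and Eq (87), can be checked numerically for any positive pair of variance components. The sketch below uses NumPy with arbitrary illustrative values and a hypothetical lot layout; none of the numbers come from the study.

```python
import numpy as np

types = np.repeat([0, 1, 2], [4, 2, 3])          # hypothetical lots per airframe
N, t = types.size, 3
Z = np.eye(t)[types]

sigma_u2, sigma_e2 = 0.7, 0.2                    # arbitrary illustrative values
lam = sigma_e2 / sigma_u2                        # ratio of the two components
P = Z.T @ Z + lam * np.eye(t)                    # P = Z'Z + lambda I
P_inv = np.linalg.inv(P)
T = np.eye(N) - Z @ P_inv @ Z.T                  # T = I - Z P^-1 Z'
V = sigma_u2 * Z @ Z.T + sigma_e2 * np.eye(N)    # covariance matrix of Y

eq83_lhs = P_inv @ Z.T @ V                       # should equal sigma_u2 * Z'
eq87_lhs = T @ V                                 # should equal sigma_e2 * I

print(np.allclose(eq83_lhs, sigma_u2 * Z.T))
print(np.allclose(eq87_lhs, sigma_e2 * np.eye(N)))
```

Eq (87) also shows that T is proportional to the inverse of the covariance matrix of Y, which is why the starred sums of squares built from T behave like generalized least squares quantities.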


The  Equations  Summarized 

This study utilized the equations developed in the previous section
in the following manner. First of all, an initial estimate of the
variance components was obtained by using Henderson's Method 3. The
equations programmed using the OMNITAB language and the symbols defined
earlier were

$$T_0 = Y'Y \tag{96}$$

$$M = X'[I - Z(Z'Z)^{-1}Z'] \tag{97}$$

$$Q = MX \tag{98}$$

$$B^{\circ} = Q^{-}MY \tag{99}$$

$$R(B|U) = B^{\circ\prime}MY \tag{100}$$

$$R(U) = Y'Z(Z'Z)^{-1}Z'Y \tag{101}$$

$$R(B) = Y'X(X'X)^{-}X'Y \tag{102}$$

$$R(B,U) = R(B|U) + R(U) \tag{103}$$

$$R(U|B) = R(B,U) - R(B) \tag{104}$$

$$c = \mathrm{tr}[Z'Z - Z'X(X'X)^{-}X'Z] \tag{105}$$

Now with the above values calculated,

$$\hat{\sigma}_\epsilon^2 = \frac{T_0 - R(B,U)}{N - r - t + 1} \tag{106}$$

and

$$\hat{\sigma}_U^2 = \frac{R(U|B) - (t - 1)\,\hat{\sigma}_\epsilon^2}{c} \tag{107}$$
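
As an illustration of how Eqs (96) through (107) chain together, the following is a minimal NumPy sketch rather than a reproduction of the OMNITAB program. The function name `henderson3`, the use of `np.linalg.pinv` for the generalized inverses, and the tiny balanced data set are all assumptions made for the example.

```python
import numpy as np

def henderson3(Y, X, Z):
    """Starting values for the variance components, following Eqs (96)-(107)."""
    N, t = Z.shape
    r = np.linalg.matrix_rank(X)
    ZtZ_inv = np.diag(1.0 / np.diag(Z.T @ Z))        # Z'Z is diagonal
    T0 = Y @ Y                                       # (96)
    M = X.T @ (np.eye(N) - Z @ ZtZ_inv @ Z.T)        # (97)
    Q = M @ X                                        # (98)
    B0 = np.linalg.pinv(Q) @ M @ Y                   # (99)
    R_B_given_U = B0 @ M @ Y                         # (100)
    R_U = Y @ Z @ ZtZ_inv @ Z.T @ Y                  # (101)
    XtX_ginv = np.linalg.pinv(X.T @ X)
    R_B = Y @ X @ XtX_ginv @ X.T @ Y                 # (102)
    R_BU = R_B_given_U + R_U                         # (103)
    R_U_given_B = R_BU - R_B                         # (104)
    c = np.trace(Z.T @ Z - Z.T @ X @ XtX_ginv @ X.T @ Z)   # (105)
    s2_e = (T0 - R_BU) / (N - r - t + 1)             # (106)
    s2_u = (R_U_given_B - (t - 1) * s2_e) / c        # (107)
    return s2_e, s2_u, c

# Made-up balanced layout: 3 airframe types, 3 lots each, intercept-only X.
Z = np.eye(3)[np.repeat([0, 1, 2], 3)]
X = np.ones((9, 1))
Y = np.array([1.0, 2.0, 3.0, 11.0, 12.0, 13.0, 21.0, 22.0, 23.0])

s2_e, s2_u, c = henderson3(Y, X, Z)
print(s2_e, s2_u, c)
```

For this balanced, intercept-only layout the formulas reduce to the usual one-way analysis of variance estimators (within mean square 1, between mean square 300), which gives a convenient hand check before moving to unbalanced lot data.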

The above first estimates were used to calculate the initial value,
$\lambda$, for the first iteration of the process summarized as follows:

$$\lambda = \hat{\sigma}_\epsilon^2 / \hat{\sigma}_U^2 \tag{108}$$

$$P = Z'Z + \lambda I \tag{109}$$

$$T = I - ZP^{-1}Z' \tag{110}$$

$$R^{*}(B|U) = Y'TX(X'TX)^{-}X'TY \tag{111}$$

$$R^{*}(U) = Y'ZP^{-1}Z'Y \tag{112}$$

$$R^{*}(B) = R(B) = Y'X(X'X)^{-}X'Y \tag{113}$$

$$R^{*}(B,U) = R^{*}(B|U) + R^{*}(U) \tag{114}$$

and

$$R^{*}(U|B) = R^{*}(B,U) - R^{*}(B) \tag{115}$$

The estimates then are

$$\hat{\sigma}_\epsilon^2 = \frac{T_0 - R^{*}(B,U)}{N - r} \tag{116}$$

and

$$\hat{\sigma}_U^2 = \frac{R^{*}(U|B)}{c} \tag{117}$$

With the new estimates of $\sigma_\epsilon^2$ and $\sigma_U^2$, a new value
for $\lambda$ is calculated and Eqs (109) through (117) are recomputed, and
so on until convergence.
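
The iteration of Eqs (108) through (117) can be sketched in NumPy as follows. This is an illustrative reading of the equations, not the original OMNITAB program. The function name `iterate_components`, the relative stopping rule, the iteration cap, the guard against a non-positive airframe component, the arbitrary starting values, and the simulated data are all assumptions added for the example. In the study the starting values came from Henderson's Method 3.

```python
import numpy as np

def iterate_components(Y, X, Z, s2_e, s2_u, tol=1e-9, max_iter=200):
    """Repeat Eqs (108)-(117) from the given starting values.

    Returns (s2_e, s2_u, iterations used, converged flag).
    """
    N, t = Z.shape
    r = np.linalg.matrix_rank(X)
    XtX_ginv = np.linalg.pinv(X.T @ X)
    T0 = Y @ Y                                               # (96)
    R_B = Y @ X @ XtX_ginv @ X.T @ Y                         # (113)
    c = np.trace(Z.T @ Z - Z.T @ X @ XtX_ginv @ X.T @ Z)     # (105)
    converged = False
    for it in range(1, max_iter + 1):
        if s2_u <= 0:        # guard added for this sketch: lambda undefined
            break
        lam = s2_e / s2_u                                    # (108)
        P_inv = np.linalg.inv(Z.T @ Z + lam * np.eye(t))     # (109)
        T = np.eye(N) - Z @ P_inv @ Z.T                      # (110), symmetric
        TX, TY = T @ X, T @ Y
        R_B_given_U = TY @ X @ np.linalg.pinv(X.T @ TX) @ X.T @ TY   # (111)
        R_U = Y @ Z @ P_inv @ Z.T @ Y                        # (112)
        R_BU = R_B_given_U + R_U                             # (114)
        R_U_given_B = R_BU - R_B                             # (115)
        new_e = (T0 - R_BU) / (N - r)                        # (116)
        new_u = R_U_given_B / c                              # (117)
        change = abs(new_e - s2_e) + abs(new_u - s2_u)
        s2_e, s2_u = new_e, new_u
        if change < tol * (abs(s2_e) + abs(s2_u)):
            converged = True
            break
    return s2_e, s2_u, it, converged

# Simulated unbalanced lot data (all values hypothetical).
rng = np.random.default_rng(2)
types = np.repeat(np.arange(6), [5, 2, 4, 3, 6, 2])
N, t = types.size, 6
Z = np.eye(t)[types]
X = np.column_stack([np.ones(N), rng.normal(size=N)])
Y = (X @ np.array([3.0, 1.5])
     + Z @ rng.normal(scale=2.0, size=t)
     + rng.normal(scale=0.5, size=N))

print(iterate_components(Y, X, Z, 1.0, 1.0))
```

In a balanced, intercept-only layout the fixed point of this iteration coincides with the one-way analysis of variance estimators, which provides a simple check on an implementation before it is applied to unbalanced lot data.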
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